<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Multilingual Text Detoxification Using Google Cloud Translation and Post-Processing Notebook for PAN at CLEF 2024</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Zhongyu</forename><surname>Luo</surname></persName>
							<email>luozhongyu2799@163.com</email>
							<affiliation key="aff0">
								<orgName type="institution">Foshan University</orgName>
								<address>
									<settlement>Foshan</settlement>
									<region>Guangdong</region>
									<country key="CN">China</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Man</forename><surname>Luo</surname></persName>
							<affiliation key="aff1">
								<orgName type="institution">Guangzhou City University of Technology</orgName>
								<address>
									<settlement>Guangzhou</settlement>
									<region>Guangdong</region>
									<country key="CN">China</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Aiguo</forename><surname>Wang</surname></persName>
							<email>wangaiguo2546@163.com</email>
							<affiliation key="aff0">
								<orgName type="institution">Foshan University</orgName>
								<address>
									<settlement>Foshan</settlement>
									<region>Guangdong</region>
									<country key="CN">China</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Multilingual Text Detoxification Using Google Cloud Translation and Post-Processing Notebook for PAN at CLEF 2024</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">CFB5E42E190791E9494420910E6DDCD1</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T18:00+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Multilingual Text Detoxification</term>
					<term>Text Detoxification</term>
					<term>Google Cloud Translation</term>
					<term>Minor Language</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>The task of text detoxification aims to rewrite toxic text into non-toxic text. Though existing methods have achieved impressive detoxification performance in monolingual settings, multilingual text detoxification remains challenging due to the complexity of natural languages and the lack of sufficient data for training accurate models for minor languages. In this study, we propose a cross-lingual text detoxification model, named GCTP, that utilizes Google Cloud Translation and post-processing for the PAN@CLEF 2024 multilingual text detoxification task. Specifically, GCTP first translates minor-language text into English for detoxification with a pretrained English model, and then translates it back into the original language and removes toxic keywords with predefined dictionaries. Extensive comparative experiments on the competition datasets show that GCTP achieves the highest J score for Amharic text and ranks fifth for Chinese text, demonstrating its effectiveness.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>With the widespread use and development of social networks, forums, and online communication platforms, the prevalence of toxic language has significantly increased, necessitating effective methods to mitigate offensive content. Text detoxification involves modifying certain attributes of the text while preserving its semantic content. For the PAN@CLEF 2024 multilingual text detoxification task, the goal is to rewrite toxic text into non-toxic text while maintaining the main content as much as possible <ref type="bibr" target="#b0">[1,</ref><ref type="bibr" target="#b1">2]</ref>. Though some methods have achieved promising results in monolingual settings <ref type="bibr" target="#b2">[3]</ref>, the semantic differences between languages make cross-lingual detoxification a challenging problem <ref type="bibr" target="#b3">[4,</ref><ref type="bibr" target="#b4">5]</ref>. Figure <ref type="figure" target="#fig_0">1</ref> lists examples of detoxification.</p><p>Existing detoxification methods mainly use predefined rules to delete toxic words or phrases <ref type="bibr" target="#b5">[6]</ref>. However, these approaches often fail to consider the overall meaning of a sentence and produce unnatural outputs <ref type="bibr" target="#b6">[7,</ref><ref type="bibr" target="#b7">8]</ref>. Accordingly, researchers have explored deep learning-based text detoxification models to address these issues. For example, Pletenev proposed autoregressive and non-autoregressive models to detoxify Russian text, treating detoxification as a specific case of text style transfer. Their approach uses an automatic post-editing algorithm to correct toxic text <ref type="bibr" target="#b8">[9]</ref>. Totmina used the pre-trained ruGPT3 to detoxify Russian text, highlighting the importance of filtering training data and fine-tuning models for specific languages <ref type="bibr" target="#b9">[10]</ref>. 
Besides the exploration of English and Russian text <ref type="bibr" target="#b10">[11,</ref><ref type="bibr" target="#b11">12]</ref>, Dementieva et al. explored the potential of large multilingual models (e.g., mBART and mT5) in cross-lingual text style transfer <ref type="bibr" target="#b12">[13]</ref>. Their work emphasizes the challenges and limitations of multilingual and cross-lingual detoxification. Motivated by previous studies, we propose a cross-lingual text detoxification model, named GCTP, to tackle the PAN@CLEF 2024 multilingual text detoxification task.</p><p>The contributions of our work are as follows: (1) Motivated by the transfer learning paradigm, we use the Google Cloud Translation API to translate minor-language text into English and use a pretrained English model (i.e., Bart-base-detox in the task) to detoxify the text at the semantic level. The processed text is then translated back into the minor language, followed by post-processing that deletes toxic keywords with a dictionary. (2) Comparative experiments are conducted on the competition datasets against three baseline detoxification methods. Experimental results show that GCTP achieves the highest J score for Amharic text in the task.</p><p>The structure of this paper is as follows. Section 2 details the proposed text detoxification model. Section 3 describes the datasets and preprocessing steps, outlines the experimental setup and baseline methods, and gives results and detoxification examples. Section 4 concludes this study.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Methodology</head><p>To detect toxic text at the semantic level, a large language model is preferred; hence, in this study, we propose a cross-lingual text detoxification model. Specifically, if a text detoxification model is available for a language (e.g., English and Russian in the contest), we first use the detoxification model to rewrite toxic text into non-toxic text and then delete residual toxic text with predefined toxic keywords. In this study, we use the Bart-base-detox model and the ruT5-base-detox model to detoxify English and Russian text, respectively.</p><p>For minor languages (e.g., Spanish, German, Chinese, Arabic, Hindi, Ukrainian, and Amharic in the contest), since there is no corresponding detoxification model, GCTP leverages the translation and detoxification capabilities of existing tools to detoxify multilingual text. Specifically, GCTP consists of four main steps: Translation, Detoxification, Backtranslation, and Post-processing, as shown in Figure <ref type="figure" target="#fig_1">2</ref>.</p><p>(1) Translation. Translate the minor-language text (i.e., target domain) into English (i.e., source domain) to utilize the powerful detoxification models available for English. The Google Cloud Translation API is used to convert the minor-language text into English. (2) Detoxification. Detoxify the translated English text using a pretrained model specialized in removing toxic content. The Bart-base-detox model, a pretrained English detoxification model, is used to handle the translated English text. It can neutralize toxic language while preserving the semantic meaning of the original text. Its detoxification process involves the following steps: (a) input processing: the translated English text is tokenized and fed into the Bart-base-detox model. </p></div>
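The four steps above can be sketched in Python. This is a minimal illustration rather than the authors' implementation: the `translate` and `detoxify` callables are assumed wrappers around the Google Cloud Translation API and the Bart-base-detox model, injected so that the pipeline logic stands on its own.

```python
import re


def post_process(text, toxic_keywords):
    """Step (4): delete any remaining toxic keywords from the predefined
    dictionary of the target language, then normalize whitespace."""
    for kw in toxic_keywords:
        text = re.sub(re.escape(kw), "", text, flags=re.IGNORECASE)
    return re.sub(r"\s{2,}", " ", text).strip()


def gctp_detoxify(text, translate, detoxify, toxic_keywords, lang):
    """GCTP pipeline sketch: Translation -> Detoxification ->
    Backtranslation -> Post-processing.

    `translate(text, source, target)` and `detoxify(text)` are injected
    callables (e.g. wrappers around Google Cloud Translation and
    s-nlp Bart-base-detox); their names here are assumptions.
    """
    english = translate(text, source=lang, target="en")    # (1) Translation
    clean_en = detoxify(english)                           # (2) Detoxification
    back = translate(clean_en, source="en", target=lang)   # (3) Backtranslation
    return post_process(back, toxic_keywords)              # (4) Post-processing
```

Injecting the translation and detoxification steps as callables keeps the sketch runnable offline with stubs, while a real run would plug in the cloud API and the pretrained model.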
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Experimental Setup and Results</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Datasets</head><p>The experimental dataset is provided by the organizers and contains a corpus of toxic sentences in nine different languages (i.e., English (en), Spanish (es), German (de), Chinese (zh), Arabic (ar), Hindi (hi), Ukrainian (uk), Russian (ru), and Amharic (am)). These datasets are divided into Development and Test phases. The former contains 400 pairs for each of the 9 languages and the latter includes 600 toxic sentences per language. Table <ref type="table">1</ref> presents a summary of the datasets.</p></div>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table"><head>Table 1</head><label>1</label><figDesc>Experimental Dataset Statistics</figDesc><table><row><cell>Dataset</cell><cell>en</cell><cell>es</cell><cell>de</cell><cell>zh</cell><cell>ar</cell><cell>hi</cell><cell>uk</cell><cell>ru</cell><cell>am</cell></row><row><cell>Development</cell><cell>400</cell><cell>400</cell><cell>400</cell><cell>400</cell><cell>400</cell><cell>400</cell><cell>400</cell><cell>400</cell><cell>400</cell></row><row><cell>Test</cell><cell>600</cell><cell>600</cell><cell>600</cell><cell>600</cell><cell>600</cell><cell>600</cell><cell>600</cell><cell>600</cell><cell>600</cell></row></table></figure>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">Experimental Setup</head><p>The provided test data contain 5,400 toxic sentences across the 9 languages; the column "neutral_sentence" is initially empty, and the task is to fill it with the predicted detoxified text. For each detoxified text, we create a JSON object with two keys ("id" and "text"), where the former indicates the sequence number of the toxic sentence and the latter holds the detoxified sentence. The detoxified text is then filled into the "neutral_sentence" column.</p><p>Fine-tuning involves adjusting model hyperparameters to improve performance on a specific task. For the mT5 and Bart models, we fine-tune them with the training data provided for the multilingual text detoxification task. For the learning rate, we experiment with values ranging from 1e-5 to 5e-4; the optimal learning rate is 3e-5 for mT5 and 2e-5 for Bart. For the batch size, we test values of 8, 16, 32, and 64. mT5 achieves a better trade-off between memory usage and training efficiency with a batch size of 32. Considering GPU memory constraints, a batch size of 16 is used for Bart.</p><p>The J score, also referred to as the Joint metric, is used to evaluate the performance of the text detoxification model. It combines three components: Style Transfer Accuracy (STA), Content Preservation (SIM), and Fluency (FL), and is calculated as the mean of the per-sample product of these three components, as shown in Equation <ref type="formula" target="#formula_0">(1)</ref>. Among the three metrics, STA measures the level of non-toxicity of the generated paraphrase, SIM evaluates the similarity of content between the original and detoxified text, and FL estimates the adequacy of the text and its similarity to human-written detoxified references.</p><formula xml:id="formula_0">J = (1/N) ∑_{i=1}^{N} STA(x_i) · SIM(x_i) · FL(x_i)<label>(1)</label></formula></div>
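The submission format and the Joint metric can be sketched as follows; the helper names are hypothetical, not part of the competition toolkit.

```python
import json


def make_submission_record(idx, detox_text):
    """One JSON object per sentence: "id" is the sequence number of the
    toxic sentence, "text" is its detoxified version."""
    return json.dumps({"id": idx, "text": detox_text}, ensure_ascii=False)


def j_score(sta, sim, fl):
    """Joint metric of Equation (1): the per-sample product
    STA(x_i) * SIM(x_i) * FL(x_i), averaged over the N samples."""
    assert len(sta) == len(sim) == len(fl) and len(sta) > 0
    return sum(s * m * f for s, m, f in zip(sta, sim, fl)) / len(sta)
```

Because each component lies in [0, 1], a single low component drives the per-sample product down, so a detoxification that is fluent but unfaithful (or vice versa) scores poorly.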
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3.">Baseline</head><p>The proposed model is compared with three baseline methods provided by the competition organizers.</p><p>Deletion. It eliminates toxic keywords based on a predefined dictionary for each language.</p><p>Backtranslation. It first uses the NLLB-600M model to translate the minor-language text into English, then uses the English Bart-base-detox model to detoxify the text, and finally translates it back into the original language.</p><p>mT5. It is a multilingual version of T5, a text-to-text transformer model trained on over 100 languages for various downstream tasks.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.4.">Results</head><p>We conduct comparative experiments using the competition datasets provided by PAN@CLEF 2024. Figure <ref type="figure" target="#fig_2">3</ref> provides examples of detoxified sentences for the nine languages, and Table <ref type="table" target="#tab_0">2</ref> presents the human evaluation results of the multilingual and cross-lingual experiments in the testing phase. The first column gives the different detoxification models, and the remaining nine columns represent the different languages. For each language, the best result is highlighted in bold. From Table <ref type="table" target="#tab_0">2</ref>, we observe that GCTP achieves the highest J score for Amharic text and ranks fifth for Chinese text. This partially demonstrates the effectiveness of GCTP. In addition, the simple Deletion method performs well in the Arabic, Hindi, and Ukrainian settings, which is possibly due to the lack of sufficient translation data.</p><p>The average scores in Table <ref type="table" target="#tab_0">2</ref> highlight that GCTP performs well across multiple languages but faces challenges with certain languages such as Hindi. In the Hindi results, the detoxified text contains errors where offensive terms are not correctly removed or neutralized. This issue highlights the need for more effective preprocessing and post-processing steps to ensure the removal of toxic content.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Conclusion</head><p>Text detoxification is a challenging task, especially for minor languages where there is insufficient data for training a powerful detoxification model. To this end, in this study we propose a detoxification model that utilizes Google Cloud Translation and post-processing. Specifically, motivated by transfer learning, we first use Google Cloud Translation to transfer the target domain (i.e., minor-language text) into the source domain (i.e., English text) and then perform detoxification. Afterwards, the text is translated back into the minor language and post-processed using predefined toxic keywords. Finally, the proposed model is compared with three baseline methods on the competition datasets. The results show its effectiveness.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Examples of Detoxification Results</figDesc><graphic coords="2,72.00,65.61,451.28,178.14" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: Transfer Learning Enabled Cross-Lingual Text Detoxification Model</figDesc><graphic coords="3,72.00,65.61,451.27,145.81" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 3 :</head><label>3</label><figDesc>Figure 3: Examples of Detoxified Sentences for Different Languages</figDesc><graphic coords="5,72.00,65.61,451.28,174.17" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 2</head><label>2</label><figDesc>Manual Evaluation Results for 100 Randomly Selected Texts per Language (* indicates the detoxified text for Hindi is incorrect)</figDesc><table><row><cell>Model</cell><cell>en</cell><cell>es</cell><cell>de</cell><cell>zh</cell><cell>ar</cell><cell>hi*</cell><cell>uk</cell><cell>ru</cell><cell>am</cell><cell>average</cell></row><row><cell>Deletion</cell><cell>0.47</cell><cell>0.55</cell><cell>0.57</cell><cell>0.43</cell><cell>0.65</cell><cell>0.65</cell><cell>0.60</cell><cell>0.49</cell><cell>0.63</cell><cell>0.56</cell></row><row><cell>mT5</cell><cell>0.68</cell><cell>0.47</cell><cell>0.64</cell><cell>0.43</cell><cell>0.63</cell><cell>0.60</cell><cell>0.42</cell><cell>0.40</cell><cell>0.61</cell><cell>0.54</cell></row><row><cell>Backtranslation</cell><cell>0.73</cell><cell>0.56</cell><cell>0.34</cell><cell>0.34</cell><cell>0.42</cell><cell>0.33</cell><cell>0.23</cell><cell>0.22</cell><cell>0.54</cell><cell>0.41</cell></row><row><cell>GCTP (Ours)</cell><cell>0.73</cell><cell>0.52</cell><cell>0.01</cell><cell>0.56</cell><cell>0.49</cell><cell>0.49</cell><cell>0.42</cell><cell>0.68</cell><cell>0.72</cell><cell>0.51</cell></row></table></figure>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<monogr>
		<author>
			<persName><forename type="first">K</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Wu</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2112.06412</idno>
		<title level="m">A survey of toxic comment classification methods</title>
				<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Overview of PAN 2024: Multi-Author Writing Style Analysis, Multilingual Text Detoxification, Oppositional Thinking Analysis, and Generative AI Authorship Verification</title>
		<author>
			<persName><forename type="first">J</forename><surname>Bevendorff</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><forename type="middle">B</forename><surname>Casals</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Chulvi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Dementieva</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Elnagar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Freitag</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Fröbe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Korenčić</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Mayerl</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Mukherjee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Panchenko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Potthast</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Rangel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Rosso</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Smirnova</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Stamatatos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Stein</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Taulé</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Ustalov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Wiegmann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Zangerle</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Experimental IR Meets Multilinguality, Multimodality, and Interaction. Proceedings of the Fourteenth International Conference of the CLEF Association (CLEF 2024)</title>
		<title level="s">Lecture Notes in Computer Science</title>
		<meeting><address><addrLine>Berlin Heidelberg New York</addrLine></address></meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
	<title level="a" type="main">Overview of the multilingual text detoxification task at PAN 2024</title>
		<author>
			<persName><forename type="first">D</forename><surname>Dementieva</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Moskovskiy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Babakov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">A</forename><surname>Ayele</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Rizwan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Schneider</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">M</forename><surname>Yimam</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Ustalov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Stakovskii</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Smirnova</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Elnagar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Mukherjee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Panchenko</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Working Notes of CLEF 2024 - Conference and Labs of the Evaluation Forum</title>
				<editor>
			<persName><forename type="first">G</forename><surname>Faggioli</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">N</forename><surname>Ferro</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">P</forename><surname>Galuščáková</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><forename type="middle">G S</forename><surname>Herrera</surname></persName>
		</editor>
		<imprint>
			<publisher>CEUR-WS</publisher>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">A study on manual and automatic evaluation for text style transfer: The case of detoxification</title>
		<author>
			<persName><forename type="first">V</forename><surname>Logacheva</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Dementieva</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Krotova</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Fenogenova</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Nikishina</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Shavrina</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Panchenko</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2nd Workshop on Human Evaluation of NLP Systems (HumEval)</title>
				<meeting>the 2nd Workshop on Human Evaluation of NLP Systems (HumEval)</meeting>
		<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="90" to="101" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Exploring cross-lingual text detoxification with large multilingual language models</title>
		<author>
			<persName><forename type="first">D</forename><surname>Moskovskiy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Dementieva</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Panchenko</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop</title>
				<meeting>the 60th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop</meeting>
		<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="346" to="354" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">ParaDetox: Detoxification with parallel data</title>
		<author>
			<persName><forename type="first">V</forename><surname>Logacheva</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Dementieva</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Ustyantsev</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Moskovskiy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Dale</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Krotova</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Semenov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Panchenko</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics</title>
		<title level="s">Long Papers</title>
		<meeting>the 60th Annual Meeting of the Association for Computational Linguistics</meeting>
		<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="6804" to="6818" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<monogr>
		<author>
			<persName><forename type="first">S</forename><surname>Mukherjee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Bansal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">K</forename><surname>Ojha</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">P</forename><surname>Mccrae</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Dušek</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2402.07767</idno>
		<title level="m">Text detoxification as style transfer in English and Hindi</title>
				<imprint>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b7">
	<monogr>
		<author>
			<persName><forename type="first">D</forename><surname>Dale</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Voronov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Dementieva</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Logacheva</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Kozlova</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Semenov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Panchenko</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2109.08914</idno>
		<title level="m">Text detoxification using large pre-trained neural models</title>
				<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b8">
	<monogr>
		<author>
			<persName><forename type="first">S</forename><surname>Pletenev</surname></persName>
		</author>
		<title level="m">Between denoising and translation: Experiments in text detoxification</title>
				<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Detoxification of Russian texts based on combination of controlled generation using pretrained ruGPT3 and the delete method</title>
		<author>
			<persName><forename type="first">E</forename><surname>Totmina</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Computational Linguistics and Intellectual Technologies: Proceedings of the International Conference &quot;Dialogue&quot;</title>
				<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="volume">21</biblScope>
			<biblScope unit="page" from="1167" to="1174" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Methods for detoxification of texts for the Russian language</title>
		<author>
			<persName><forename type="first">D</forename><surname>Dementieva</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Moskovskiy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Logacheva</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Dale</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Kozlova</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Semenov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Panchenko</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Multimodal Technologies and Interaction</title>
		<imprint>
			<biblScope unit="volume">5</biblScope>
			<biblScope unit="page">54</biblScope>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<monogr>
		<author>
			<persName><forename type="first">D</forename><surname>Dementieva</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Logacheva</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Nikishina</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Fenogenova</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Dale</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Krotova</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Semenov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Shavrina</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Panchenko</surname></persName>
		</author>
		<title level="m">Findings of the first Russian detoxification shared task based on parallel corpora</title>
				<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
	<note>RUSSE-2022</note>
</biblStruct>

<biblStruct xml:id="b12">
	<monogr>
		<author>
			<persName><forename type="first">D</forename><surname>Dementieva</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Moskovskiy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Dale</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Panchenko</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2311.13937</idno>
		<title level="m">Exploring methods for cross-lingual text style transfer: The case of text detoxification</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
