<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>MetaDetox at TextDetox CLEF 2025: Detoxification with Few-Chain Prompting</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Sara Bourbour Hosseinbeigi</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Amin Saeidi Kelishami</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Maryam Gheysari</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Fatemeh Rahimzadeh</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Computer Engineering Department, Sharif University of Technology</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>IT Engineering Department, School of Industrial and Systems Engineering, Tarbiat Modares University</institution>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>School of Electrical and Computer Engineering, University of Tehran</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>Toxic language on social media presents a persistent barrier to safe and inclusive online communication. While traditional approaches to detoxification rely on fine-tuned models or rule-based substitutions, they are often limited by data availability, scalability, and linguistic diversity. In this paper, we (MetaDetox team) present Few-Chain Detox, a multilingual, multi-style detoxification system that achieves top-tier performance in the TextDetox 2025 shared task. Our method eliminates the need for model fine-tuning by leveraging Chain-of-Thought prompting and few-shot learning to guide a powerful multilingual language model (DeepSeek) across 15 languages, including low-resource and code-switched varieties. For each input, we generate multiple stylistically controlled rewrites (mild, neutral, formal), and apply semantic similarity and toxicity classifiers to rerank outputs. Despite using no task-specific training, MetaDetox team ranked second overall in the competition and outperformed all zero-shot baselines. Our results highlight the potential of prompt-based, model-free approaches in multilingual style transfer and controlled text generation.</p>
      </abstract>
      <kwd-group>
        <kwd>Multilingual detoxification</kwd>
        <kwd>prompt-based generation</kwd>
        <kwd>few-shot learning</kwd>
        <kwd>chain-of-thought prompting</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Toxic language on social media remains a pervasive threat to online safety and digital well-being. While
most platforms rely on automatic detection and removal of offensive content, there is growing interest
in proactive moderation strategies that rewrite toxic messages into neutral alternatives rather than
simply blocking them [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>
        This task, known as text detoxification, is a form of text style transfer where the source style is toxic
(e.g., profanity, insults), and the target style is neutral or polite. The objective is to eliminate explicit
hate or vulgarity while preserving the original message’s semantic content [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>
        Prior research has shown that addressing explicit toxicity—such as overt slurs and profanities—is
both feasible and critical, as these forms of abuse are widespread across languages. However, most
detoxification research has focused on English, with only limited efforts in languages like Russian,
Spanish, Hindi, and Amharic. This multilingual gap has prompted the organization of shared tasks
aimed at expanding detoxification methods beyond English [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
      <p>
        The Multilingual Text Detoxification (TextDetox) 2025 shared task responds to this need by evaluating
systems that transform toxic text into non-toxic text across 15 typologically diverse languages. Building
on the 2024 edition [
        <xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>
        ], the 2025 challenge introduces both multilingual and cross-lingual scenarios,
emphasizing transfer from high-resource to low-resource languages [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. The focus is explicitly on direct,
explicit toxicity—such as profanity and vulgar insults—rather than implicit forms like sarcasm or coded
hate, making the task more tractable through paraphrasing.
      </p>
      <p>
        Participants are provided with parallel corpora of toxic and detoxified sentences in several
languages [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], and systems are evaluated using automatic metrics for style transfer accuracy and content
preservation [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], along with human judgment.
      </p>
      <p>In this work, we present Few-Chain Detox1, a novel multilingual detoxification system that ranked
second in the TextDetox 2025 shared task. Unlike traditional fine-tuned or translation-based approaches,
our method relies exclusively on prompt-based generation, requiring no task-specific model updates.
We combine few-shot prompting—with curated task examples—with chain-of-thought reasoning to
guide large language models (LLMs) in identifying and neutralizing explicit toxicity while preserving
meaning. To enhance robustness, we generate multiple candidates per input and apply reranking to
select the most fluent and safe output.</p>
      <p>Despite its simplicity, Few-Chain Detox achieved strong performance across all evaluation metrics
and languages, demonstrating that prompt-based detoxification, combined with lightweight reranking,
is a competitive and scalable solution for multilingual toxic content moderation.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        Rule-Based and Lexical Detoxification. Early approaches to detoxification relied on lexical
substitutions or removals, such as masking profanities or dropping toxic words. While effective in reducing
explicit toxicity, these methods often produced disfluent or semantically incomplete outputs [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
Dementieva et al. [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] observed that context-free word removal can render sentences unnatural or misleading.
More refined approaches like Delete-Retrieve-Generate [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] aimed to improve fluency by replacing
toxic spans with retrieved neutral phrases. Similarly, unsupervised style transfer methods used
encoder–decoder architectures to disentangle toxic style from content [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. While foundational, these early
techniques lacked the fluency and accuracy of modern neural models.
      </p>
      <p>
        Sequence-to-Sequence Fine-Tuning. The availability of parallel toxic–neutral corpora enabled
supervised detoxification through fine-tuned sequence-to-sequence models. Dale et al. [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] and Logacheva
et al. [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] fine-tuned BART [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] and T5 [13] for English detoxification, achieving strong results. APPDIA
[14] introduced discourse-aware fine-tuning for conversation-level toxicity. In Russian,
ruT5—finetuned on a dedicated detox corpus—was successful in RUSSE-2022 [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. Multilingual models like mBART
[15] and mT5 [16] further enabled multi-language detoxification. Rykov et al. [17] fine-tuned a 3.7B-parameter
mT0 model on augmented multilingual data, achieving near-SOTA performance. Novel
architectures have also emerged, such as DiffuDetox [18], which applies diffusion-based generation,
and MaRCo [19] and DExperts [20], which steer outputs using competing models. While fine-tuned
generation is highly effective, it depends on substantial parallel data and compute resources.
      </p>
      <p>Prompt-Based Detoxification and LLMs. Recent advances in large language models have enabled
prompt-based detoxification, which guides frozen models to rewrite toxic content through instructions
or demonstrations. InstructGPT [21] demonstrated reliable prompt-following behavior. GPT-Detox [22]
used few-shot prompting with GPT-3.5 to paraphrase toxic sentences, outperforming some fine-tuned
models. Similarly, Zhang et al. [23] showed that ChatGPT (GPT-4) could detoxify Reddit posts while
identifying toxicity types. Open-source LLMs such as LLaMA [24] and DeepSeek [25] have also been
adapted for detoxification. Luo et al. [26] reported that DeepSeek outperformed ChatGPT in Chinese
medical QA, illustrating the potential of non-English LLMs. These findings suggest that prompt-based
methods offer a scalable and language-flexible alternative to fine-tuning.</p>
      <p>
        Multilingual and Low-Resource Detoxification. Detoxification in diverse languages is hindered
by the scarcity of parallel data. Cross-lingual strategies address this by training on high-resource
languages and transferring to low-resource ones. Dementieva et al. [27] showed that combining English
training data with machine-translated outputs boosts performance in other languages. Teams in the
PAN@CLEF 2024 shared task used synthetic data via translation to fine-tune multilingual models
[17]. Pretrained models like mT5 [16], mT0 [28], and translated ParaDetox data [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] enabled zero-shot
generalization to unseen languages. New datasets for low-resource languages—e.g., Amharic by Ayele et
      </p>
      <sec id="sec-2-1">
        <title>Footnote 1: The dataset &amp; code are available at https://github.com/Amin-Saeidi/FewChain_Detox</title>
        <p>
          al. [29]—further support multilingual detoxification research. However, Mukherjee [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ] notes that with
only a few thousand pairs, seq2seq models struggle to maintain semantic fidelity, making low-resource
detoxification an ongoing challenge.
        </p>
        <p>
          Reranking and Generation Selection. Detoxification systems often generate multiple rewrites,
which vary in fluency and toxicity. Reranking strategies aim to select the best output from these
candidates. Holtzman et al. [30] used decoding diversity to reduce toxicity, while DExperts [20] selected
tokens by balancing toxic and anti-toxic LMs. In detoxification, systems like XDetox [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] sample
paraphrases and use classifiers to select the least toxic, semantically faithful version. Hallinan et al. [19]
apply expert-pair revisions iteratively to steer generation. Such reranking pipelines, which follow a
generate-then-select approach, significantly improve detox quality and motivate our use of reranking
in Few-Chain Detox.
        </p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Methodology</title>
      <p>We present a multilingual detoxification framework based on Chain-of-Thought (CoT) prompting,
few-shot learning, and style-controlled generation. Rather than fine-tuning a separate model for each
language, our approach leverages the inherent generalization capabilities of large multilingual language
models (specifically, DeepSeek), combined with prompt design and reranking.</p>
      <p>Our system supports 15 languages, organized as follows:
• High-resource languages (9): Languages with gold-standard toxic–non-toxic training data.
• Low-resource languages (6): Languages without parallel training data. For these, we used
ChatGPT-4o [31] to generate polite rephrasings. To ensure quality, we applied multiple verification
prompts and filtered the outputs using an automatic toxicity classifier.</p>
      <p>Each toxic input is transformed into three stylistically distinct detoxified outputs: mild, neutral, and
formal. An overview of the full system architecture is shown in Figure 1a.</p>
      <sec id="sec-3-1">
        <title>3.1. Prompting Strategy and Example Construction</title>
        <p>We designed a CoT-style prompt template that decomposes the detoxification process into three steps:</p>
        <sec id="sec-3-1-1">
          <title>Detoxification steps</title>
          <p>1. Identify toxic spans in the input.
2. Explain why the spans are toxic.
3. Generate detoxified alternatives in mild, neutral, and formal styles.</p>
          <p>Each prompt is supplemented with few-shot examples tailored to each language:
• For high-resource languages, we used gold-standard examples from the training set (8–12 per
prompt).
• For low-resource languages, we constructed synthetic few-shot examples using ChatGPT-4o
(12–16 per prompt), formatted in the same CoT style. These were filtered using a multilingual
toxicity classifier 2 to ensure quality and reduce noise.</p>
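      <p>The filtering of synthetic examples can be sketched as follows. This is a minimal illustration, not the actual pipeline code: `toxicity_score` is a trivial lexical stub standing in for the multilingual XLM-R toxicity classifier named above, and the marker list and 0.5 threshold are assumptions made for the example.</p>

```python
# Sketch: keep only synthetic (toxic, detoxified) pairs whose rewrite is
# judged non-toxic. A real system would call a trained classifier instead
# of the lexical stub below.

def toxicity_score(text: str) -> float:
    """Placeholder toxicity probability in [0, 1] (illustrative stub)."""
    words = {w.strip(".,!?").lower() for w in text.split()}
    return 1.0 if words & {"idiot", "stupid", "crap"} else 0.0

def filter_examples(pairs, max_toxicity=0.5):
    """Drop pairs whose detoxified side still scores as toxic."""
    return [(src, tgt) for src, tgt in pairs if toxicity_score(tgt) < max_toxicity]

pairs = [
    ("You're such a stupid idiot.", "You're being quite unreasonable."),
    ("This is crap.", "This is crap."),  # rewrite failed: still toxic, dropped
]
clean = filter_examples(pairs)
```

      <p>Only the first pair survives; the second is discarded because its rewrite still contains profanity.</p>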
          <p>Few-shot examples for prompting were selected randomly. We sampled batches uniformly from the
synthetically created pool, ensuring diversity across toxic expressions and sentence lengths. We tested
several prompt batches per language and selected the best-performing batch based on internal scores
for toxicity suppression and semantic similarity. The preparation process for few-shot examples is
summarized in Figure 1b.</p>
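        <p>The batch sampling and selection procedure above can be sketched as follows; the example pool and the `score_fn` are placeholders (the real internal score ran each batch on development sentences and combined toxicity suppression with semantic similarity).</p>

```python
import random

def sample_batches(pool, batch_size, n_batches, seed=0):
    """Draw several candidate few-shot batches uniformly from the example pool."""
    rng = random.Random(seed)
    return [rng.sample(pool, batch_size) for _ in range(n_batches)]

def select_best_batch(batches, score_fn):
    """Keep the batch with the highest internal development score."""
    return max(batches, key=score_fn)

pool = [f"example_{i}" for i in range(40)]  # stand-in for curated CoT examples
batches = sample_batches(pool, batch_size=12, n_batches=5)
# Placeholder score for illustration only; in practice this would evaluate
# the prompt built from each batch on held-out toxic sentences.
best = select_best_batch(batches, score_fn=lambda b: len(set(b)))
```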
        </sec>
        <sec id="sec-3-1-2">
          <title>Footnote 2: https://huggingface.co/textdetox/xlmr-large-toxicity-classifier-v2</title>
          <p>Figure 1: (a) Few-Chain detoxification method pipeline. (b) Few-shot prompt preparation for high-resource and low-resource languages.</p>
        </sec>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Multi-Level Generation and Inference with LLM</title>
        <p>For each toxic sentence, the prompt instructed the model to generate two outputs per style level:
• Mild: Informal and softened tone
• Neutral: Standard and conversational tone
• Formal: Polished and professional tone</p>
        <p>We used the DeepSeek API3, a state-of-the-art multilingual language model, for generation. Each input
produced five candidates per style level, resulting in 15 detoxified outputs per input. This generation
step is illustrated in the full method pipeline (Figure 1a).</p>
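        <p>The generation step amounts to the loop sketched below. `llm_generate` is a placeholder for one DeepSeek API request carrying the CoT few-shot prompt; the actual request and response format is not shown, and the stub merely realizes the five-candidates-per-style scheme.</p>

```python
# Sketch: five candidates for each of three style levels gives 15 detoxified
# outputs per toxic input, matching the setup described in the text.

STYLES = ("mild", "neutral", "formal")
CANDIDATES_PER_STYLE = 5

def llm_generate(toxic_sentence: str, style: str, sample_id: int) -> str:
    """Placeholder for one LLM call (stub output, not a real API response)."""
    return f"[{style} rewrite #{sample_id}] {toxic_sentence}"

def generate_candidates(toxic_sentence: str) -> dict:
    return {
        style: [llm_generate(toxic_sentence, style, i)
                for i in range(CANDIDATES_PER_STYLE)]
        for style in STYLES
    }

cands = generate_candidates("You're such a stupid idiot.")
total = sum(len(v) for v in cands.values())
```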
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Reranking and Output Selection</title>
        <p>To ensure diversity and control, we prompted the model to generate five detoxified outputs per style
level (mild, neutral, formal) for each toxic input, yielding 15 candidates per input sentence. We then
applied level-wise reranking to each group of five using two criteria:
• Semantic similarity, computed via cosine similarity between LaBSE embeddings [32], to
preserve original meaning.</p>
        <p>• Toxicity score, computed using a multilingual classifier 4, to ensure the output was non-toxic.
3All prompts were executed via the official DeepSeek API using their default multilingual base model at the time of writing.
4https://huggingface.co/textdetox/xlmr-large-toxicity-classifier-v2</p>
        <p>During development, we evaluated multiple batches of few-shot examples and hyperparameters per
language. These batches were scored internally using the same similarity and toxicity metrics, and the
highest-performing batch was retained for final prompting.</p>
        <p>Finally, we submitted all three detoxified variants (mild, neutral, formal) per input to the competition’s
evaluation system. The best-performing style level for each sentence—based on the official joint metric
(STA × SIM × FL)—was then selected as our final output and submission. The overall automatic
evaluation process is illustrated in Figure 2.</p>
        <p>The official joint metric used for evaluation combines three components:
• Style Transfer Accuracy (STA): Toxicity classification of the output using a toxicity classifier.
• Semantic Similarity (SIM): Cosine similarity between LaBSE embeddings of the input and
generated sentence.
• Fluency (FL): A fluency estimate based on the generated sentence’s adequacy and its resemblance
to human-written detoxified references.</p>
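        <p>The joint metric is simply the per-sentence product of the three components, averaged over the evaluation set; the component values below are illustrative, not scores from the shared task.</p>

```python
# Sketch: per-sentence joint score STA * SIM * FL, averaged over a corpus.

def joint_score(sta: float, sim: float, fl: float) -> float:
    return sta * sim * fl

def corpus_joint(scores):
    """Average the per-sentence STA x SIM x FL products."""
    per_sentence = [joint_score(s, m, f) for s, m, f in scores]
    return sum(per_sentence) / len(per_sentence)

# Two hypothetical sentences: averages the products 0.72 and 0.5.
j = corpus_joint([(1.0, 0.9, 0.8), (0.5, 1.0, 1.0)])
```

        <p>Because the components are multiplied, a single zero (a fully toxic output, or a rewrite unrelated to the input) zeroes that sentence's score, which is why reranking on both criteria matters.</p>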
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Results</title>
      <p>We evaluated our system, Few-Chain Detox, on the official multilingual test set provided in the TextDetox
2025 shared task, which includes 15 typologically diverse languages. The results are compared against
several strong baselines, including fine-tuned multilingual models (e.g., mT0), large proprietary LLMs
(GPT-4, GPT-4o), lightweight models (o3-mini), and unsupervised methods (backtranslation, delete,
duplicate).</p>
      <p>Table 1 summarizes the joint metric scores (STA × SIM × FL) for both parallel (AvgP)
and non-parallel (AvgNP) settings. The MetaDetox team ranked 2nd overall, achieving the highest score in multiple
languages, such as Spanish (es) and Arabic (ar), and top-3 placements in over 10 languages. Our method
significantly outperformed all prompting-based baselines (e.g., GPT-4, GPT-4o, o3-mini) and nearly
all fine-tuned models in non-English languages, especially in low-resource settings like Hebrew (he),
and Hindi (hin). These results validate the effectiveness of our few-shot CoT prompting and reranking
strategy across diverse linguistic and resource contexts.</p>
      <p>To evaluate the contribution of individual components in our pipeline, we conducted a small-scale
analysis focusing on two aspects: (1) the sensitivity of the model to different few-shot prompt batches,
and (2) the impact of reranking on output quality.
• Few-Shot Prompt Sensitivity. We tested five randomly sampled batches of few-shot examples
per language, each containing diverse toxic expressions and sentence lengths. The joint score
varied by up to ±0.05 across batches, with the best-performing batch selected for final
submission. This suggests that while prompt selection does influence performance, the model remains
relatively robust to variation in example composition.
• Reranking Impact. We compared the performance of our system with and without reranking
on a 100-sentence English subset. Without reranking, the average joint score dropped from 0.742
to 0.681, primarily due to increased toxicity and reduced fluency. This confirms that reranking
plays a critical role in selecting safe, fluent, and semantically faithful outputs.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion and Future Work</title>
      <p>This paper introduced Few-Chain Detox, a prompt-based multilingual detoxification system that
participated in the TextDetox 2025 shared task and achieved a top-2 overall rank. Our approach
diverged from traditional fine-tuning pipelines and instead employed a strategically crafted
prompt-driven generation framework built upon Chain-of-Thought (CoT) reasoning, few-shot examples, and
style-aware conditioning. By generating multiple stylistic variants per input and leveraging
LaBSE-based reranking with toxicity filtering, we were able to submit clean, fluent, and semantically faithful
detoxifications across 15 diverse languages.</p>
      <p>Few-Chain Detox showed competitive or superior performance to several strong baselines, including
fine-tuned multilingual Transformers like mT0, as well as zero-shot prompting systems like GPT-4
and GPT-4o. Our system was particularly effective for languages where training data is scarce or
non-existent—demonstrating the adaptability of few-shot CoT prompting for cross-lingual
generalization. The pipeline required no parameter updates or additional model training, highlighting its
cost-effectiveness and potential scalability to unseen languages or domains. The key contributions of
our work include:
• A generalizable, training-free framework for multilingual detoxification based on prompt
engineering and candidate reranking.
• A novel style-controlled generation paradigm producing mild, neutral, and formal rewrites.
• Language-specific few-shot CoT prompting strategies for both high- and low-resource settings.
• Empirical validation of the reranking approach using semantic similarity and toxicity classifiers.</p>
      <p>Several directions could extend this work. Cross-lingual CoT prompting, where demonstrations in one
language are reused across related languages via translation or multilingual embeddings, may reduce the
need for language-specific prompt engineering. Incorporating fluency-aware reranking models—such
as xCOMET or GPT-based evaluators—could further enhance the naturalness and readability of outputs.
Another important direction is addressing implicit and context-dependent toxicity, including sarcasm,
microaggressions, and stereotype-based language, which remain challenging even for advanced language
models. Few-Chain Detox illustrates how prompt-based generation and intelligent output selection
can enable scalable, interpretable detoxification across languages—supporting safer and more inclusive
online communication.</p>
    </sec>
    <sec id="sec-6">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this work, the authors used GPT-4 and Microsoft Copilot to draft
content, generate the literature review, paraphrase and reword text, generate the abstract, check grammar
and spelling, and simulate peer review. After using these tools, the authors reviewed and edited the content
as needed and take full responsibility for the publication’s content.</p>
      <p>[12] M. Lewis, Y. Liu, N. Goyal, M. Ghazvininejad, A. Mohamed, O. Levy, V. Stoyanov, L. Zettlemoyer,
Bart: Denoising sequence-to-sequence pre-training for natural language generation, translation,
and comprehension, arXiv preprint arXiv:1910.13461 (2019).
[13] C. Raffel, N. Shazeer, A. Roberts, K. Lee, S. Narang, M. Matena, Y. Zhou, W. Li, P. J. Liu, Exploring
the limits of transfer learning with a unified text-to-text transformer, Journal of Machine Learning
Research 21 (2020) 1–67.
[14] K. Atwell, S. Hassan, M. Alikhani, Appdia: A discourse-aware transformer-based style transfer
model for offensive social media conversations, arXiv preprint arXiv:2209.08207 (2022).
[15] Y. Liu, J. Gu, N. Goyal, X. Li, S. Edunov, M. Ghazvininejad, M. Lewis, L. Zettlemoyer,
Multilingual denoising pre-training for neural machine translation, Transactions of the Association for
Computational Linguistics 8 (2020) 726–742.
[16] L. Xue, N. Constant, A. Roberts, M. Kale, R. Al-Rfou, A. Siddhant, A. Barua, C. Raffel, mt5: A
massively multilingual pre-trained text-to-text transformer, arXiv preprint arXiv:2010.11934
(2020).
[17] E. Rykov, K. Zaytsev, I. Anisimov, A. Voronin, Smurfcat at pan 2024 textdetox: Alignment of
multilingual transformers for text detoxification, arXiv preprint arXiv:2407.05449 (2024).
[18] G. Floto, M. M. A. Pour, P. Farinneya, Z. Tang, A. Pesaranghader, M. Bharadwaj, S. Sanner,
Diffudetox: A mixed diffusion model for text detoxification, arXiv preprint arXiv:2306.08505
(2023).
[19] S. Hallinan, A. Liu, Y. Choi, M. Sap, Detoxifying text with marco: Controllable revision with
experts and anti-experts, arXiv preprint arXiv:2212.10543 (2022).
[20] A. Liu, M. Sap, X. Lu, S. Swayamdipta, C. Bhagavatula, N. A. Smith, Y. Choi, Dexperts: Decoding-time
controlled text generation with experts and anti-experts, arXiv preprint arXiv:2105.03023
(2021).
[21] L. Ouyang, J. Wu, X. Jiang, D. Almeida, C. Wainwright, P. Mishkin, C. Zhang, S. Agarwal, K. Slama,
A. Ray, et al., Training language models to follow instructions with human feedback, Advances in
neural information processing systems 35 (2022) 27730–27744.
[22] A. Pesaranghader, N. Verma, M. Bharadwaj, Gpt-detox: An in-context learning-based paraphraser
for text detoxification, in: 2023 International Conference on Machine Learning and Applications
(ICMLA), IEEE, 2023, pp. 1528–1534.
[23] B. Zhang, X. Shen, W. M. Si, Z. Sha, Z. Chen, A. Salem, Y. Shen, M. Backes, Y. Zhang, Comprehensive
assessment of toxicity in chatgpt, arXiv preprint arXiv:2311.14685 (2023).
[24] H. Touvron, T. Lavril, G. Izacard, X. Martinet, M.-A. Lachaux, T. Lacroix, B. Rozière, N. Goyal,
E. Hambro, F. Azhar, et al., Llama: Open and efficient foundation language models, arXiv preprint
arXiv:2302.13971 (2023).
[25] DeepSeek-AI, D. Guo, D. Yang, H. Zhang, J. Song, R. Zhang, et al., Deepseek-r1: Incentivizing
reasoning capability in llms via reinforcement learning, 2025. URL: https://arxiv.org/abs/2501.12948.
arXiv:2501.12948.
[26] P.-W. Luo, J.-W. Liu, X. Xie, J.-W. Jiang, X.-Y. Huo, Z.-L. Chen, Z.-C. Huang, S.-Q. Jiang, M.-Q.</p>
      <p>Li, Deepseek vs chatgpt: a comparison study of their performance in answering prostate cancer
radiotherapy questions in multiple languages, American Journal of Clinical and Experimental
Urology 13 (2025) 176.
[27] D. Dementieva, D. Moskovskiy, D. Dale, A. Panchenko, Exploring methods for cross-lingual text
style transfer: The case of text detoxification, arXiv preprint arXiv:2311.13937 (2023).
[28] N. Muennighoff, T. Wang, L. Sutawika, A. Roberts, S. Biderman, T. L. Scao, M. S. Bari, S. Shen,
Z.-X. Yong, H. Schoelkopf, et al., Crosslingual generalization through multitask finetuning, arXiv
preprint arXiv:2211.01786 (2022).
[29] A. A. Ayele, S. M. Yimam, T. D. Belay, T. Asfaw, C. Biemann, Exploring amharic hate speech data
collection and classification approaches, in: Proceedings of the 14th International Conference on
Recent Advances in Natural Language Processing, 2023, pp. 49–59.
[30] A. Holtzman, J. Buys, L. Du, M. Forbes, Y. Choi, The curious case of neural text degeneration,
arXiv preprint arXiv:1904.09751 (2019).
[31] OpenAI, Gpt-4o, https://openai.com/chatgpt, 2024. Large language model.</p>
    </sec>
    <sec id="sec-7">
      <title>Appendix A: Prompt Template for Detoxification</title>
      <p>The following is the prompt template used to guide the DeepSeek model in detoxifying toxic
sentences across multiple languages. The prompt includes clear instructions and a few-shot example to
demonstrate the detoxification process.</p>
      <sec id="sec-7-1">
        <title>Prompt Template</title>
        <p>You are a helpful assistant trained to make toxic or offensive sentences more polite and respectful while
keeping their original meaning. Sometimes the sentences are in different languages. Actually, you are
given a sentence in a specific language, and your task is to perform the following steps:
1. Identify and remove toxic words.
2. Improve grammar if needed. After removing the toxic words, replace them with polite alternatives and
check the sentence’s grammar. Make only minimal changes necessary to correct grammatical errors and
improve sentence structure.
3. Ensure the final result preserves the original sentence structure and wording as much as possible.
Your task is to:
• Detoxify the following sentence.
• Give priority to removing only the offensive or profane words while preserving the original
sentence structure as much as possible.
• Preserve the original meaning and tone.
• ONLY return the detoxified version inside square brackets, like this: [Detoxified sentence]
• Do NOT include any language tags like &lt;en&gt;, &lt;fa&gt;, etc.</p>
        <p>• Do NOT add any explanation or commentary.</p>
        <p>Below are some examples of how to detoxify: {shot_prompt}</p>
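        <p>The assembly of the final prompt can be sketched as follows. The `TEMPLATE` string is an abridged paraphrase of the template above, not its exact wording; `build_prompt` and the sample shot are hypothetical names introduced for this illustration.</p>

```python
# Sketch: fill the {shot_prompt} slot with language-specific few-shot
# examples, then append the toxic sentence to detoxify.

TEMPLATE = (
    "You are a helpful assistant trained to make toxic or offensive sentences "
    "more polite and respectful while keeping their original meaning.\n"
    "ONLY return the detoxified version inside square brackets.\n\n"
    "Below are some examples of how to detoxify:\n{shot_prompt}\n\n"
    "Sentence to detoxify: {toxic_sentence}"
)

def build_prompt(shots, toxic_sentence: str) -> str:
    """Join the few-shot examples and substitute both template slots."""
    return TEMPLATE.format(shot_prompt="\n\n".join(shots),
                           toxic_sentence=toxic_sentence)

prompt = build_prompt(
    ["Toxic: This is crap. -> [This is poor quality.]"],  # illustrative shot
    "You're such a stupid idiot.",
)
```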
      </sec>
    </sec>
    <sec id="sec-8">
      <title>Appendix B: Sample Few-Shot Prompt (English)</title>
      <p>Below is a sample of {shot_prompt} used in the prompt for English detoxification. Each example
includes the original toxic sentence, intermediate steps, and the final detoxified output.</p>
      <sec id="sec-8-1">
        <title>Few-Shot Example (English)</title>
        <p>Toxic Sentence: You’re such a stupid idiot, nobody wants to hear your crap.</p>
        <p>Step 1 - Remove toxic words: You’re such a ____ ____, nobody wants to hear your ____.
Step 2 - Improve grammar: You’re such a rude person, nobody wants to hear you.
Step 3 - Preserving original structure: You’re such a rude person, nobody wants to hear your opinion.
Final Output: [You’re such a rude person, nobody wants to hear your opinion.]</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>J.</given-names>
            <surname>Bevendorf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Dementieva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Fröbe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Gipp</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Greiner-Petter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Karlgren</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mayerl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Nakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Panchenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Potthast</surname>
          </string-name>
          , et al.,
          <source>Overview of pan</source>
          <year>2025</year>
          :
          <article-title>Generative ai detection, multilingual text detoxification, multi-author writing style analysis, and generative plagiarism detection</article-title>
          ,
          <source>in: European Conference on Information Retrieval</source>
          , Springer,
          <year>2025</year>
          , pp.
          <fpage>434</fpage>
          -
          <lpage>441</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>M.</given-names>
            <surname>Sourabrata</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Akanksha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. O.</given-names>
            <surname>Atul</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. M.</given-names>
            <surname>John</surname>
          </string-name>
          , D. Ondrej,
          <article-title>Text detoxification as style transfer in english and hindi</article-title>
          ,
          <source>in: Proceedings of the 20th International Conference on Natural Language Processing (ICON)</source>
          ,
          <year>2023</year>
          , pp.
          <fpage>133</fpage>
          -
          <lpage>144</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>D.</given-names>
            <surname>Dementieva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Moskovskiy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Babakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. A.</given-names>
            <surname>Ayele</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Rizwan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Schneider</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Yimam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ustalov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Stakovskii</surname>
          </string-name>
          , et al.,
          <article-title>Overview of the multilingual text detoxification task at PAN 2024</article-title>
          ,
          <source>Working Notes of CLEF</source>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name><given-names>J.</given-names> <surname>Bevendorff</surname></string-name>
          ,
          <string-name><given-names>X. B.</given-names> <surname>Casals</surname></string-name>
          ,
          <string-name><given-names>B.</given-names> <surname>Chulvi</surname></string-name>
          ,
          <string-name><given-names>D.</given-names> <surname>Dementieva</surname></string-name>
          ,
          <string-name><given-names>A.</given-names> <surname>Elnagar</surname></string-name>
          ,
          <string-name><given-names>D.</given-names> <surname>Freitag</surname></string-name>
          ,
          <string-name><given-names>M.</given-names> <surname>Fröbe</surname></string-name>
          ,
          <string-name><given-names>D.</given-names> <surname>Korencic</surname></string-name>
          ,
          <string-name><given-names>M.</given-names> <surname>Mayerl</surname></string-name>
          ,
          <string-name><given-names>A.</given-names> <surname>Mukherjee</surname></string-name>
          ,
          <string-name><given-names>A.</given-names> <surname>Panchenko</surname></string-name>
          ,
          <string-name><given-names>M.</given-names> <surname>Potthast</surname></string-name>
          ,
          <string-name><given-names>F.</given-names> <surname>Rangel</surname></string-name>
          ,
          <string-name><given-names>P.</given-names> <surname>Rosso</surname></string-name>
          ,
          <string-name><given-names>A.</given-names> <surname>Smirnova</surname></string-name>
          ,
          <string-name><given-names>E.</given-names> <surname>Stamatatos</surname></string-name>
          ,
          <string-name><given-names>B.</given-names> <surname>Stein</surname></string-name>
          ,
          <string-name><given-names>M.</given-names> <surname>Taulé</surname></string-name>
          ,
          <string-name><given-names>D.</given-names> <surname>Ustalov</surname></string-name>
          ,
          <string-name><given-names>M.</given-names> <surname>Wiegmann</surname></string-name>
          ,
          <string-name><given-names>E.</given-names> <surname>Zangerle</surname></string-name>
          ,
          <article-title>Overview of PAN 2024: Multi-author writing style analysis, multilingual text detoxification, oppositional thinking analysis, and generative AI authorship verification - extended abstract</article-title>
          , in:
          <string-name><given-names>N.</given-names> <surname>Goharian</surname></string-name>
          ,
          <string-name><given-names>N.</given-names> <surname>Tonellotto</surname></string-name>
          ,
          <string-name><given-names>Y.</given-names> <surname>He</surname></string-name>
          ,
          <string-name><given-names>A.</given-names> <surname>Lipani</surname></string-name>
          ,
          <string-name><given-names>G.</given-names> <surname>McDonald</surname></string-name>
          ,
          <string-name><given-names>C.</given-names> <surname>Macdonald</surname></string-name>
          ,
          <string-name><given-names>I.</given-names> <surname>Ounis</surname></string-name>
          (Eds.),
          <source>Advances in Information Retrieval - 46th European Conference on Information Retrieval, ECIR 2024, Glasgow, UK, March 24-28, 2024, Proceedings, Part VI</source>
          , volume
          <volume>14613</volume>
          of Lecture Notes in Computer Science, Springer,
          <year>2024</year>
          , pp.
          <fpage>3</fpage>
          -
          <lpage>10</lpage>
          . URL: https://doi.org/10.1007/978-3-031-56072-9_1. doi:10.1007/978-3-031-56072-9_1.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>D.</given-names>
            <surname>Dementieva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Babakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ronen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. A.</given-names>
            <surname>Ayele</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Rizwan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Schneider</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Yimam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. A.</given-names>
            <surname>Moskovskiy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Stakovskii</surname>
          </string-name>
          ,
          <string-name><given-names>E.</given-names> <surname>Kaufman</surname></string-name>
          ,
          <string-name><given-names>A.</given-names> <surname>Elnagar</surname></string-name>
          ,
          <string-name><given-names>A.</given-names> <surname>Mukherjee</surname></string-name>
          ,
          <string-name><given-names>A.</given-names> <surname>Panchenko</surname></string-name>
          ,
          <article-title>Multilingual and explainable text detoxification with parallel corpora</article-title>
          , in:
          <string-name><given-names>O.</given-names> <surname>Rambow</surname></string-name>
          ,
          <string-name><given-names>L.</given-names> <surname>Wanner</surname></string-name>
          ,
          <string-name><given-names>M.</given-names> <surname>Apidianaki</surname></string-name>
          ,
          <string-name><given-names>H.</given-names> <surname>Al-Khalifa</surname></string-name>
          ,
          <string-name><given-names>B. D.</given-names> <surname>Eugenio</surname></string-name>
          ,
          <string-name><given-names>S.</given-names> <surname>Schockaert</surname></string-name>
          (Eds.),
          <source>Proceedings of the 31st International Conference on Computational Linguistics</source>
          , Association for Computational Linguistics, Abu Dhabi, UAE,
          <year>2025</year>
          , pp.
          <fpage>7998</fpage>
          -
          <lpage>8025</lpage>
          . URL: https://aclanthology.org/2025.coling-main.535/.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>B.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y. S.</given-names>
            <surname>Choi</surname>
          </string-name>
          ,
          <article-title>XDetox: Text detoxification with token-level toxicity explanations</article-title>
          ,
          <source>in: Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing</source>
          ,
          <year>2024</year>
          , pp.
          <fpage>15215</fpage>
          -
          <lpage>15226</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>D.</given-names>
            <surname>Dementieva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Logacheva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Nikishina</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Fenogenova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Dale</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Krotova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Semenov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Shavrina</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Panchenko</surname>
          </string-name>
          ,
          <article-title>RUSSE-2022: Findings of the first Russian detoxification shared task based on parallel corpora</article-title>
          ,
          <source>Computational Linguistics and Intellectual Technologies</source>
          (
          <year>2022</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>J.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Jia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Liang</surname>
          </string-name>
          ,
          <article-title>Delete, retrieve, generate: a simple approach to sentiment and style transfer</article-title>
          ,
          <source>arXiv preprint arXiv:1804.06437</source>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>C. N. d.</given-names>
            <surname>Santos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Melnyk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Padhi</surname>
          </string-name>
          ,
          <article-title>Fighting offensive language on social media with unsupervised text style transfer</article-title>
          ,
          <source>arXiv preprint arXiv:1805.07685</source>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>D.</given-names>
            <surname>Dale</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Voronov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Dementieva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Logacheva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Kozlova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Semenov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Panchenko</surname>
          </string-name>
          ,
          <article-title>Text detoxification using large pre-trained neural models</article-title>
          ,
          <source>arXiv preprint arXiv:2109.08914</source>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>V.</given-names>
            <surname>Logacheva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Dementieva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ustyantsev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Moskovskiy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Dale</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Krotova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Semenov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Panchenko</surname>
          </string-name>
          ,
          <article-title>ParaDetox: Detoxification with parallel data</article-title>
          ,
          <source>in: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)</source>
          ,
          <year>2022</year>
          , pp.
          <fpage>6804</fpage>
          -
          <lpage>6818</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name><given-names>M.</given-names> <surname>Lewis</surname></string-name>
          ,
          <string-name><given-names>Y.</given-names> <surname>Liu</surname></string-name>
          ,
          <string-name><given-names>N.</given-names> <surname>Goyal</surname></string-name>
          ,
          <string-name><given-names>M.</given-names> <surname>Ghazvininejad</surname></string-name>
          ,
          <string-name><given-names>A.</given-names> <surname>Mohamed</surname></string-name>
          ,
          <string-name><given-names>O.</given-names> <surname>Levy</surname></string-name>
          ,
          <string-name><given-names>V.</given-names> <surname>Stoyanov</surname></string-name>
          ,
          <string-name><given-names>L.</given-names> <surname>Zettlemoyer</surname></string-name>
          ,
          <article-title>BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension</article-title>
          ,
          <source>in: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>7871</fpage>
          -
          <lpage>7880</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [32]
          <string-name><given-names>F.</given-names> <surname>Feng</surname></string-name>
          ,
          <string-name><given-names>Y.</given-names> <surname>Yang</surname></string-name>
          ,
          <string-name><given-names>D.</given-names> <surname>Cer</surname></string-name>
          ,
          <string-name><given-names>N.</given-names> <surname>Arivazhagan</surname></string-name>
          ,
          <string-name><given-names>W.</given-names> <surname>Wang</surname></string-name>
          ,
          <article-title>Language-agnostic BERT sentence embedding</article-title>
          ,
          <source>arXiv preprint arXiv:2007.01852</source>
          (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>