<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Team cake at TextDetox CLEF 2025/Multilingual Text Detoxification 2025: A Multilingual Text Detoxification Method Based on Chain-of-Thoughts Prompting Approach</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Jiangao Peng</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Kaiyin Sun</string-name>
          <email>sunkaiyin123@163.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Kaichuan Lin</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Zhankeng Liang</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Zhongyuan Han</string-name>
          <email>hanzhongyuan@gmail.com</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Foshan No.3 Middle School</institution>
          ,
          <addr-line>Foshan</addr-line>
          ,
          <country country="CN">China</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Foshan University</institution>
          ,
          <addr-line>Foshan</addr-line>
          ,
          <country country="CN">China</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <abstract>
        <p>This paper presents a multilingual text detoxification method based on a chain-of-thoughts prompting approach for PAN at CLEF 2025. Multilingual text detoxification aims to transform toxic texts into neutral versions while preserving the original meaning and grammatical structure. Our method leverages large language models (LLMs) to classify toxic sentences and detoxify them through carefully designed prompts. We evaluate our approach on the PAN 2025 multilingual text detoxification task, demonstrating its potential and stability in handling various languages.</p>
      </abstract>
      <kwd-group>
        <kwd>PAN 2025</kwd>
        <kwd>Text Detoxification</kwd>
        <kwd>Large Language Models</kwd>
        <kwd>Chain-of-Thoughts Prompting</kwd>
        <kwd>Text Classification</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
    </sec>
    <sec id="sec-2">
      <title>2. Method</title>
      <p>Our method consists of three key stages: 1) extracting toxic text features; 2) classifying toxic text by its features; 3) detoxifying the text using both the features and the classification. All three stages are implemented
through carefully designed prompts fed into the large language model.</p>
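      <p>The three stages can be read as a simple prompt chain. The following is a minimal Python sketch under stated assumptions, not the authors' implementation: the llm callable, the run_pipeline function, the templates dictionary, and the {{FEATURES}} and {{CLUSTER}} placeholders are hypothetical names introduced here for illustration only.</p>
      <preformat><![CDATA[
```python
from typing import Callable, Dict

# Hypothetical interface: any function that maps a prompt string to a model reply.
LLM = Callable[[str], str]


def run_pipeline(sentence: str, llm: LLM, templates: Dict[str, str]) -> Dict[str, str]:
    """Chain the three prompting stages, passing each reply to the next stage."""
    # Stage 1: ask the model to describe the toxic features of the sentence.
    features = llm(templates["features"].replace("{{SENTENCE}}", sentence))

    # Stage 2: ask the model to assign a cluster, given the extracted features.
    cluster_prompt = (
        templates["cluster"]
        .replace("{{SENTENCE}}", sentence)
        .replace("{{FEATURES}}", features)
    )
    cluster = llm(cluster_prompt)

    # Stage 3: ask the model to rewrite the sentence, given features and cluster.
    detox_prompt = (
        templates["detox"]
        .replace("{{SENTENCE}}", sentence)
        .replace("{{FEATURES}}", features)
        .replace("{{CLUSTER}}", cluster)
    )
    detoxified = llm(detox_prompt)

    return {"features": features, "cluster": cluster, "detoxified": detoxified}
```
]]></preformat>
      <p>Passing the model as a plain callable keeps the sketch independent of any particular API client, so a stub function can stand in for the model when checking the order of the prompts.</p>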
      <sec id="sec-2-1">
        <title>2.1. Extracting Toxic Text Features</title>
        <p>
          In this stage, we adopt a similar approach to that proposed by Dementieva et al. [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ], leveraging the
capabilities of LLMs to meticulously extract text features from the input sentences. The prompt provided
to the LLM is designed to elicit a comprehensive analysis of the text, focusing on identifying elements
of toxicity and categorizing various aspects of the sentence structure and semantics. Figure 1 illustrates
the structure of this prompt.
        </p>
        <p>Please analyze the provided sentence using the structure below to identify elements of toxicity and suggest improvements, when I
tell you, use words from the keywords list (can be more than one word!):
keywords = [Neutral, Informative, Casual, Assertive, Dismissive, Condescending, Friendly, Commanding, Instructive Derogatory,
Confrontational, Insulting, Vulgar, Formal, Informal, Offensive, Technical, Playful, Positive, Frustration, Analytical, Professional,
Hostile, Hatred, Helpful, Angry, Friendly, Arrogant]
Analysis Structure (do not use “ and [] and "" in your answer and do not suggest improvement!):
{</p>
        <p>Sentence: {{SENTENCE}},
Toxicity Level: Specify here (Low/Medium/High),
Tone: the overall tone of the sentence- choose from keywords,
Language: Language style—choose from keywords,
Implied Sentiment: the overall sentiment- choose from keywords,
Context: Brief description of how context contributes to toxicity,
Negative Connotations: List specific negative words/phrases here,</p>
        <p>Intent: Describe the perceived intent behind the sentence.
}</p>
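        <p>To use the analysis in later stages, the reply to the feature-extraction prompt has to be turned into structured fields. The sketch below is one possible way to do this in Python, assuming the model answers with one "Field: value," pair per line as the prompt requests. The function names build_feature_prompt and parse_analysis are hypothetical, and real replies may need more tolerant parsing.</p>
        <preformat><![CDATA[
```python
from typing import Dict

# Field names taken from the analysis structure of the feature-extraction prompt.
FIELDS = (
    "Sentence",
    "Toxicity Level",
    "Tone",
    "Language",
    "Implied Sentiment",
    "Context",
    "Negative Connotations",
    "Intent",
)


def build_feature_prompt(template: str, sentence: str) -> str:
    """Insert the input sentence into the {{SENTENCE}} slot of the template."""
    return template.replace("{{SENTENCE}}", sentence)


def parse_analysis(reply: str) -> Dict[str, str]:
    """Collect 'Field: value' lines for known fields, dropping braces and trailing commas."""
    result: Dict[str, str] = {}
    for raw_line in reply.splitlines():
        line = raw_line.strip().strip("{}").strip()
        # Split at the first colon only, so values may contain colons themselves.
        key, sep, value = line.partition(":")
        key = key.strip()
        if sep and key in FIELDS:
            result[key] = value.strip().rstrip(",").strip()
    return result
```
]]></preformat>
        <p>Lines that do not start with one of the expected field names are ignored, so extra commentary in the reply does not end up in the parsed features.</p>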
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Classifying Toxic Text by Features</title>
        <p>In this stage, the extracted text features are fed into the model. With its powerful understanding and analytical capabilities, the LLM
can uncover latent semantic connections between text features, thereby achieving more precise and
semantically logical clustering of texts. Figure 2 illustrates the structure of this prompt.</p>
        <p>Please cluster the provided sentence using the structure below, when I tell you, use group_index from the groups.
The sentences will clustered into 3 groups while each group has its own characterizes.</p>
        <p>Cluster 0 is more Offensive, Hostile and Vulgar;
Cluster 1 is more Condescending, Derogatory and Hostile;
Cluster 2 is more Informal, Casual, Dismissive.</p>
        <p>Analysis Structure (do not use “ and [] and ”“ in your answer and do not suggest improvement!):
{</p>
        <p>Sentence: {{SENTENCE}},
Toxicity Level: {{TL}},
Tone: {{TONE}},
Language:{{Language}},
Implied Sentiment: {{IS}},</p>
        <p>Cluster: &lt;the cluster index of the sentence&gt;
}</p>
      </sec>
      <sec id="sec-2-3">
        <title>2.3. Text Detoxification via Text Features and Classification</title>
        <p>
          To achieve text detoxification, we extended the approach of Dementieva et al. [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] by designing a prompt
structure that leverages both text features and classification information. This approach allows us to
systematically convert toxic sentences into neutral, informal, or casual ones while preserving their
original meaning. By specifying the cluster characteristics and toxicity levels, the LLM can better
understand the context and nuances of the input sentence, enabling it to generate an appropriate
non-toxic version. Figure 3 illustrates the structure of this prompt.
        </p>
        <p>Please detoxify the provided sentence using the structure below without changing the real meaning of the sentence.
The sentences are clustered into 3 groups while each group has its own characterizes.</p>
        <p>Cluster 0 is more Offensive, Hostile and Vulgar;
Cluster 1 is more Condescending, Derogatory and Hostile;
Cluster 2 is more Informal, Casual, Dismissive.</p>
        <p>For each sentence and cluster that I give, make the sentence non-toxic by making it Neutral/Informal/Casual without changing the
meaning.</p>
        <p>Analysis Structure (do not use ” and [] and “” in your answer and do not suggest improvement!):
{</p>
        <p>Sentence: {{Sentence}},
Toxicity level: {{TL}},
Cluster: {{Cluster}},</p>
        <p>Fixed sentence: &lt;the non-toxic sentence after making it Neutral/Informal/Casual without changing the meaning, use the same
language as the original sentence&gt;;
}</p>
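        <p>The detoxification prompt has three slots and one field of interest in the reply. The following Python sketch shows one way to fill the slots and read back the rewritten sentence. It assumes the model returns a line beginning with "Fixed sentence:" as the prompt requests. The names DETOX_SLOTS, fill_slots, and extract_fixed_sentence are hypothetical and are not taken from the authors' code.</p>
        <preformat><![CDATA[
```python
import re
from typing import Mapping, Optional

# The slot lines of the detoxification prompt, copied from its analysis structure.
DETOX_SLOTS = "Sentence: {{Sentence}},\nToxicity level: {{TL}},\nCluster: {{Cluster}},"


def fill_slots(template: str, values: Mapping[str, object]) -> str:
    """Replace every {{name}} placeholder with its value; a missing name raises KeyError."""
    return re.sub(r"\{\{(\w+)\}\}", lambda m: str(values[m.group(1)]), template)


def extract_fixed_sentence(reply: str) -> Optional[str]:
    """Return the text after 'Fixed sentence:' on that line, or None if the field is absent."""
    match = re.search(r"Fixed sentence:\s*(.+)", reply)
    if match is None:
        return None
    # Drop the trailing semicolon that the prompt's answer structure ends with.
    return match.group(1).strip().rstrip(";").strip()
```
]]></preformat>
        <p>Returning None when the field is missing lets the caller decide how to handle a malformed reply, for example by retrying the prompt or falling back to the original sentence.</p>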
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Experiment</title>
      <sec id="sec-3-1">
        <title>3.1. Dataset</title>
        <p>The PAN 2025 multilingual text detoxification task has been improved and expanded in terms of data<sup>1</sup>,
evaluation metrics, and task settings. In this year’s competition, six additional languages have been
introduced, none of which come with parallel datasets.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Settings</title>
        <p>
          In this competition, we used DeepSeek-R1 [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ], provided by Volcengine<sup>2</sup>.
        </p>
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Evaluation</title>
        <p>The organizers provide four metrics<sup>3</sup>. Each metric component lies in the range [0, 1].</p>
        <list list-type="bullet">
          <list-item><p>Style Transfer Accuracy (STA): classifies the level of non-toxicity of the generated text.</p></list-item>
          <list-item><p>Content preservation (SIM): given two texts (the original toxic sentence and the generated paraphrase), evaluates the similarity of their content.</p></list-item>
          <list-item><p>ChrF1: estimates the adequacy of the text and its similarity to the human-written detoxified references.</p></list-item>
          <list-item><p>Joint (J): to provide a single metric for the leaderboard, the organizers compute J as the mean of STA × SIM × ChrF1 per sample.</p></list-item>
        </list>
        <p><sup>1</sup> https://huggingface.co/datasets/textdetox/multilingual_paradetox_test</p>
        <p><sup>2</sup> https://www.volcengine.com/</p>
        <p><sup>3</sup> https://codalab.lisn.upsaclay.fr/competitions/22396#learn_the_detailsevaluation</p>
      </sec>
      <sec id="sec-3-baseline">
        <title>3.4. Baseline</title>
        <list list-type="bullet">
          <list-item><p>golden annotation: Human-written detoxified references.</p></list-item>
          <list-item><p>baseline_gpt4: A baseline model using GPT-4 for detoxification.</p></list-item>
          <list-item><p>baseline_mt0: A baseline model using the mT0<sup>4</sup> model for detoxification.</p></list-item>
          <list-item><p>baseline_o3mini: A baseline model using the o3-mini model for detoxification.</p></list-item>
          <list-item><p>baseline_gpt4o: A baseline model using a variant of GPT-4 for detoxification.</p></list-item>
          <list-item><p>baseline_duplicate: A simple baseline that duplicates the toxic input as the output.</p></list-item>
          <list-item><p>baseline_backtranslation: A baseline using backtranslation for detoxification. It translates the input to a language with a strong detoxification model, detoxifies it, and translates it back to the target language.</p></list-item>
        </list>
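        <p>The Joint (J) score described in Section 3.3 can be illustrated with a short Python sketch. It assumes that per-sample STA, SIM, and ChrF1 values are already available as equal-length lists. The function name joint_score is hypothetical, and this is not the organizers' evaluation script.</p>
        <preformat><![CDATA[
```python
from typing import Sequence


def joint_score(sta: Sequence[float], sim: Sequence[float], chrf: Sequence[float]) -> float:
    """Mean over samples of the per-sample product STA * SIM * ChrF1."""
    if not (len(sta) == len(sim) == len(chrf)):
        raise ValueError("all metric lists must have the same length")
    if len(sta) == 0:
        raise ValueError("at least one sample is required")
    # Multiply the three metrics for each sample first, then average the products.
    products = [a * b * c for a, b, c in zip(sta, sim, chrf)]
    return sum(products) / len(products)
```
]]></preformat>
        <p>Because the product is taken per sample before averaging, a sample that fails on any one metric contributes little to J even if the other two metrics are high, which is generally not the same as multiplying the three corpus-level averages.</p>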
      </sec>
      <sec id="sec-3-4">
        <title>3.5. Result</title>
        <p>The organizers provide an evaluation using the LLM-as-a-Judge approach. Specifically, they fine-tuned
the Llama-3.1-8B-Instruct<sup>5</sup> model on the manual annotations from the TextDetox2024<sup>6</sup> dataset.</p>
        <p>Table 1 shows the final results of the LLM-as-a-Judge<sup>7</sup> evaluation. The golden annotation (human-written references)
performed best, with an average score of 0.828, and reached a high score of 0.904 in Japanese.
Team cake (our method) ranked second with an average score of 0.674. Although this score is
lower than that of the golden annotation, it is higher than that of every baseline model. This indicates that
our method has considerable potential and stability in multilingual evaluation, although there
is still room for improvement in certain languages. Overall, the golden annotation was
very stable across all languages, while our method was competitive
with the golden annotation in the majority of languages.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Conclusion</title>
      <p>In this paper, we proposed a multilingual text detoxification method based on a chain-of-thoughts
prompting approach. Our method effectively uses large language models to extract toxic text
features, classify toxic sentences, and detoxify them while preserving their original meaning. The
experimental results on the PAN 2025 multilingual text detoxification task show that our method
has considerable potential and stability, outperforming the baseline models. Although there is still
room for improvement in certain languages, our approach demonstrates its effectiveness in handling
multilingual text detoxification tasks. Future work will focus on further enhancing the performance of
our method and exploring more advanced techniques to improve text detoxification across different
languages.</p>
      <p><sup>4</sup> https://huggingface.co/s-nlp/mt0-xl-detox-orpo</p>
      <p><sup>5</sup> https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct</p>
      <p><sup>6</sup> https://github.com/textdetox/textdetox_clef_2024/tree/main/human_evaluation_results</p>
      <p><sup>7</sup> https://pan.webis.de/clef25/pan25-web/text-detoxification.html#results</p>
    </sec>
    <sec id="sec-5">
      <title>5. Acknowledgments</title>
      <p>This work is supported by the Social Science Foundation of Guangdong Province, China (No. GD24CZY02).</p>
    </sec>
    <sec id="sec-6">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this work, we used a Large Language Model; specifically, we employed
DeepSeek-R1, provided by Volcengine, for the following purposes: extracting toxic text features,
classifying toxic texts, and performing text detoxification. We also used basic AI-powered text-editing
tools to check the grammar and spelling of the manuscript content. After using these tools, we reviewed
and edited the content as needed and take full responsibility for the publication’s content.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>D.</given-names>
            <surname>Dementieva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Moskovskiy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Logacheva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Dale</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Kozlova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Semenov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Panchenko</surname>
          </string-name>
          ,
          <article-title>Methods for detoxification of texts for the Russian language</article-title>
          ,
          <source>Multimodal Technologies and Interaction</source>
          <volume>5</volume>
          (
          <year>2021</year>
          )
          <fpage>54</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>D.</given-names>
            <surname>Dale</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Voronov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Dementieva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Logacheva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Kozlova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Semenov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Panchenko</surname>
          </string-name>
          ,
          <article-title>Text detoxification using large pre-trained neural models</article-title>
          ,
          <source>in: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>7979</fpage>
          -
          <lpage>7996</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>V.</given-names>
            <surname>Logacheva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Dementieva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ustyantsev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Moskovskiy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Dale</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Krotova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Semenov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Panchenko</surname>
          </string-name>
          ,
          <article-title>ParaDetox: Detoxification with parallel data</article-title>
          ,
          <source>in: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)</source>
          ,
          <year>2022</year>
          , pp.
          <fpage>6804</fpage>
          -
          <lpage>6818</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>D.</given-names>
            <surname>Dementieva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Logacheva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Nikishina</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Fenogenova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Dale</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Krotova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Semenov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Shavrina</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Panchenko</surname>
          </string-name>
          ,
          <article-title>RUSSE-2022: Findings of the first Russian detoxification shared task based on parallel corpora</article-title>
          ,
          <source>in: Proceedings of the RUSSE-2022 Shared Task</source>
          ,
          <year>2022</year>
          . doi:10.28995/2075-7182-2022-21-114-131.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>V.</given-names>
            <surname>Logacheva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Dementieva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Krotova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Fenogenova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Nikishina</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Shavrina</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Panchenko</surname>
          </string-name>
          ,
          <article-title>A study on manual and automatic evaluation for text style transfer: The case of detoxification</article-title>
          ,
          <source>in: Proceedings of the 2nd Workshop on Human Evaluation of NLP Systems (HumEval)</source>
          ,
          <year>2022</year>
          , pp.
          <fpage>90</fpage>
          -
          <lpage>101</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>D.</given-names>
            <surname>Dementieva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Moskovskiy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Dale</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Panchenko</surname>
          </string-name>
          ,
          <article-title>Exploring methods for cross-lingual text style transfer: The case of text detoxification</article-title>
          ,
          <source>in: Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)</source>
          ,
          <year>2023</year>
          , pp.
          <fpage>1083</fpage>
          -
          <lpage>1101</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>D.</given-names>
            <surname>Dementieva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Moskovskiy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Babakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. A.</given-names>
            <surname>Ayele</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Rizwan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Schneider</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Yimam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ustalov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Stakovskii</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Smirnova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Elnagar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mukherjee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Panchenko</surname>
          </string-name>
          ,
          <article-title>Overview of the multilingual text detoxification task at PAN 2024</article-title>
          ,
          <source>in: CEUR Workshop Proceedings</source>
          , CEUR-WS.org,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>D.</given-names>
            <surname>Dementieva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Babakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ronen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. A.</given-names>
            <surname>Ayele</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Rizwan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Schneider</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Yimam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Moskovskiy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Stakovskii</surname>
          </string-name>
          , et al.,
          <article-title>Multilingual and explainable text detoxification with parallel corpora</article-title>
          ,
          <source>arXiv preprint arXiv:2412.11691</source>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>D.</given-names>
            <surname>Guo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Song</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Bi</surname>
          </string-name>
          , et al.,
          <article-title>DeepSeek-R1: Incentivizing reasoning capability in LLMs via reinforcement learning</article-title>
          ,
          <source>arXiv preprint arXiv:2501.12948</source>
          (
          <year>2025</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>