PAN 2024 Multilingual TextDetox: Exploring Different Regimes For Synthetic Data Training For Multilingual Text Detoxification
Notebook for PAN at CLEF 2024
Nikita Sushko1,*
1 Skoltech, Bolshoy Boulevard, 30, p.1, 121205, Moscow, Russian Federation

Abstract
Multilingual text detoxification is a style transfer task of creating neutral versions of toxic texts across multiple languages. In this paper, we use a mix of real and synthetic data to build a multilingual text detoxification model using a parallel corpus of toxic and non-toxic texts in 9 languages. We evaluate models trained on various combinations of the training data and determine the optimal training regime. Our proposed approach, which combines an ensemble model with a toxic word deletion baseline, placed third in the automatic evaluation and fourth in the manual evaluation of the TextDetox 2024 shared task.

Keywords
PAN 2024, Multilingual Text Detoxification (TextDetox) 2024, style transfer, multilingual detoxification, text generation, evaluation, competition, metrics analysis, cross-language knowledge transfer, synthetic data

1. Introduction

The proliferation of online social networks has given rise to new challenges in maintaining safe and respectful digital environments. With the increasing prevalence of toxic language, such as hate speech and profanity, online communities face significant threats to their well-being and cohesion. In response, some social media platforms like VK have implemented measures to classify user-generated content as "toxic" or "non-toxic," offering users alternative responses, like stickers or emojis, to convey their intended meaning without resorting to offensive language. However, these approaches are limited in their ability to address the broader issue of toxic content, since users can simply ignore the suggested stickers and send toxic messages anyway.

One promising approach to this problem is text detoxification: a technique aimed at transforming potentially offensive input into neutral output without compromising its original meaning or intent. In this paper, we propose a two-stage algorithm for multilingual text detoxification that combines a bigscience/mt0-xl1 [1] model, finetuned on a mix of publicly available and synthetic data, with deletion of toxic words. The pipeline used for synthetic data generation is also presented. Different training regimes with various mixes of synthetic and real data are explored, and the optimal training regime is determined. The resulting synthetic dataset2 and detoxification model3 are available on HuggingFace. The resulting algorithm achieved third place across all languages in the automatic evaluation and fourth place in the manual evaluation of the PAN at CLEF Multilingual Text Detoxification (TextDetox) 2024 shared task [2, 3].

CLEF 2024: Conference and Labs of the Evaluation Forum, September 09–12, 2024, Grenoble, France
* Corresponding author.
nikita.sushko@skoltech.ru (N. Sushko)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
1 bigscience/mt0-xl on HuggingFace https://huggingface.co/bigscience/mt0-xl
2 chameleon-lizard/synthetic-multilingual-paradetox on HuggingFace https://huggingface.co/datasets/chameleon-lizard/synthetic-multilingual-paradetox
3 chameleon-lizard/detox-mt0-xl on HuggingFace https://huggingface.co/chameleon-lizard/detox-mt0-xl
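To give a concrete preview of the pipeline's first stage, the following is a minimal sketch of running the released detoxification model with the prompt format described in Section 4.3. The generation settings (max_new_tokens, num_beams) are illustrative assumptions rather than our exact configuration; the toxic word deletion stage is applied on top of this output (see Sections 3.3 and 4.3).

```python
# Sketch of stage 1 (seq2seq rewrite) using the released model.
# Generation settings here are illustrative assumptions, not the exact ones used.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("chameleon-lizard/detox-mt0-xl")
model = AutoModelForSeq2SeqLM.from_pretrained("chameleon-lizard/detox-mt0-xl")

def rewrite(sentence: str, language: str) -> str:
    # Prompt format from Section 4.3 keeps the output in the input language.
    prompt = (f"Write a non-toxic version of the following text "
              f"in {language}: {sentence}")
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=128, num_beams=4)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

print(rewrite("your toxic sentence here", "English"))
```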
2. Previous work

Text detoxification is a relatively new field, which started with a paper by dos Santos et al. [4], who used an encoder-decoder translation model, trained with a cycle consistency loss, to solve the task of unsupervised detoxification. More recently, Laugier et al. [5] proposed finetuning a T5 model [6] on a detoxification task, using a denoising and cyclic autoencoder loss. The RUSSE-2022 shared task [7] further explored non-English detoxification, with solutions ranging from decoder-only networks with suitable prompts to finetuning an encoder-only tagger for toxic words combined with a style transfer encoder-decoder model for the detoxification itself [8].

In addition to these approaches, Dale et al. [9] proposed two algorithms. The CondBERT approach, inspired by Wu et al. [10], uses a finetuned BERT model to replace toxic tokens in a sequence with non-toxic ones. The second approach, ParaGedi, reframes text detoxification as a paraphrasing problem and imposes constraints on toxic tokens during generation.

The authors of [11] proposed finetuning a multilingual mBART model on a large parallel corpus of English and Russian texts. Their work showed that reformulating detoxification as a neural machine translation task boosts model performance, given enough data, outperforming the CondBERT baseline. They also demonstrated that a pretrained multilingual model can be finetuned on any of the languages it has seen during pretraining, not only on the model's main language.

3. Data

The TextDetox 2024 shared task consisted of two phases. During the dev phase, the organizers provided a training set of 1000 parallel toxic and neutral samples in Russian and English. During the test phase, the organizers provided a training set of 400 parallel toxic and neutral samples in 9 languages: English, German, Spanish, Amharic, Arabic, Hindi, Chinese, Ukrainian and Russian. Additionally, a non-parallel set of 2500 toxic and 2500 neutral sentences in the same 9 languages was provided, as well as a toxic lexicon dataset consisting of swear words in these languages.

3.1. Metrics

To assess the resulting models and the provided data, we calculated the STA, SIM, chrF_1 and J metrics. The STA metric measures style transfer accuracy (non-toxicity of the output) using the textdetox/xlmr-large-toxicity-classifier4 [12] model. The SIM metric is the cosine similarity between the sentence-transformers/LaBSE5 [13] embeddings of the input and the output (i.e., the toxic and neutral sentences). chrF_1 [14] measures the similarity between the model output and the references using character n-grams. The J metric is the product of the three: J = STA * SIM * chrF_1. The metrics were calculated with the evaluation script provided by the competition organizers, with toxic examples as the input and neutral examples as both references and output.

3.2. Data preprocessing

Upon examining the provided data, we found that its quality varied significantly from language to language. As shown in Table 1, the quality of the provided examples is suboptimal in Chinese and Hindi, as the "neutral" sentences have extremely low STA scores. This indicates that only 25% of Chinese and 36% of Hindi neutral examples are actually non-toxic. Furthermore, the neutral sentences in Amharic are quite distinct from the toxic sentences, as evidenced by the SIM metric of 0.67.
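For reference, the following is a minimal sketch of how the STA, SIM, chrF_1 and J metrics from Section 3.1 can be computed for a single example, assuming the models named above. The official evaluation script may differ in details; in particular, the classifier's label names ("neutral" below) and the chrF parameters are assumptions.

```python
# Sketch of per-example STA, SIM, chrF_1 and J computation.
from transformers import pipeline
from sentence_transformers import SentenceTransformer, util
from sacrebleu.metrics import CHRF

sta_clf = pipeline("text-classification",
                   model="textdetox/xlmr-large-toxicity-classifier")
labse = SentenceTransformer("sentence-transformers/LaBSE")
chrf = CHRF()  # sacrebleu chrF; the shared task's exact settings may differ

def j_metric(toxic_src: str, detoxed: str, reference: str) -> float:
    # STA: probability that the detoxified output is non-toxic.
    # Assumption: the classifier exposes a "neutral" label; adjust if it
    # uses different label names.
    pred = sta_clf(detoxed)[0]
    sta = pred["score"] if pred["label"] == "neutral" else 1.0 - pred["score"]
    # SIM: cosine similarity between LaBSE embeddings of input and output.
    embs = labse.encode([toxic_src, detoxed], convert_to_tensor=True)
    sim = util.cos_sim(embs[0], embs[1]).item()
    # chrF between output and reference, rescaled from [0, 100] to [0, 1].
    chrf_1 = chrf.sentence_score(detoxed, [reference]).score / 100.0
    return sta * sim * chrf_1
```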
4 textdetox/xlmr-large-toxicity-classifier on HuggingFace https://huggingface.co/textdetox/xlmr-large-toxicity-classifier
5 sentence-transformers/LaBSE on HuggingFace https://huggingface.co/sentence-transformers/LaBSE

Table 1
STA, SIM, and number of real pairs before ("dirty") and after ("clean") cleaning

Language  STA dirty  STA clean  SIM dirty  SIM clean  Pairs dirty  Pairs clean
en        0.87       0.99       0.82       0.85       400          328
ru        0.87       0.99       0.81       0.84       400          321
uk        0.88       0.99       0.89       0.90       400          347
de        0.82       0.99       0.92       0.92       400          323
es        0.81       0.99       0.82       0.84       400          309
am        0.90       0.98       0.67       0.80       400          241
zh        0.25       0.92       0.80       0.83       400          84
ar        0.79       0.98       0.88       0.89       400          309
hi        0.36       0.98       0.81       0.86       400          124

Figure 1: Per-language plots of STA (blue line) and SIM (orange line) against sentence index for the real data: (a) before cleaning; (b) after cleaning. Panels: en, ru, hi, ar, am, zh, de, uk, es.

To visualize this, we can sort the sentences in each language by their toxicity scores and plot the STA and SIM scores per language (Fig. 1a). By applying a hard threshold of 0.6 to both the similarity and toxicity metrics, we can filter out noisy data. However, this approach leads to a drastic reduction in the number of examples in Chinese and Hindi, with Chinese being left with only 84 examples and Hindi with 124. In addition, we drop all examples longer than 512 symbols to ensure training stability (Fig. 1b).

3.3. Generating synthetic data

Due to the limited amount of data available after removing non-detoxified pairs from the training data, we needed to generate a new dataset. To achieve this, we employed the following algorithm (a minimal sketch of the loop is given after Figure 2):

1. Train a detox model on the uncleaned dataset;
2. Run inference of this model on the toxic sentences from the unpaired multilingual dataset;
3. Check whether the toxicity classification model classifies the output as non-toxic; if the output is still toxic, delete all toxic words from it using the toxic lexicon dataset;
4. Check again whether the toxicity classification model classifies the output as non-toxic:
   • If the output is toxic, do not add the sentence to the resulting dataset;
   • If the output is non-toxic, add the sentence to the dataset.

Figure 2: Per-language plots of STA (blue line) and SIM (orange line) against sentence index for the synthetic data: (a) before cleaning; (b) after cleaning. Panels: en, ru, hi, ar, am, zh, de, uk, es.
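The sketch below mirrors steps 2-4 of this algorithm. The names detox_model, classify_toxic and toxic_lexicon are hypothetical wrappers around the dirty-data detox model, the toxicity classifier and the per-language toxic word lists described above, not the exact code we used.

```python
# Hedged sketch of the synthetic-pair generation loop from Section 3.3.
# detox_model, classify_toxic and toxic_lexicon are hypothetical placeholders.

def build_synthetic_pairs(toxic_sentences, lang, detox_model, classify_toxic,
                          toxic_lexicon):
    pairs = []
    for toxic in toxic_sentences:
        # Step 2: detoxify with the model trained on the uncleaned dataset.
        neutral = detox_model(toxic, lang)
        # Step 3: if the output is still toxic, delete known toxic words.
        if classify_toxic(neutral):
            kept = [w for w in neutral.split()
                    if w.lower() not in toxic_lexicon[lang]]
            neutral = " ".join(kept)
        # Step 4: keep the pair only if the final output is non-toxic.
        if not classify_toxic(neutral):
            pairs.append((toxic, neutral))
    return pairs
```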
Table 2
STA, SIM, and number of synthetic pairs before ("dirty") and after ("clean") cleaning

Language  STA dirty  STA clean  SIM dirty  SIM clean  Pairs dirty  Pairs clean
en        0.79       0.99       0.82       0.84       818          617
ru        0.86       0.99       0.87       0.88       1461         1233
uk        0.91       0.99       0.92       0.92       1778         1599
de        0.61       0.97       0.94       0.93       237          143
es        0.70       0.98       0.91       0.91       599          418
am        0.57       0.95       0.89       0.88       416          213
zh        0.61       0.95       0.83       0.85       287          138
ar        0.64       0.95       0.90       0.91       591          367
hi        0.50       0.93       0.87       0.87       82           36

For the toxicity classification model, we utilized the intfloat/multilingual-e5-large model [15], trained on the non-parallel data with an 80/20 train-test split. For the detox model, we employed the bigscience/mt0-xl model [1]. We trained it for one epoch on all languages, using the AdamW optimizer with a learning rate of 1e-4, a constant scheduler, and a batch size of 6. All training was performed in full precision. The rationale behind choosing this model and its evaluation are presented in the Experiments section.

Although the resulting dataset is of lower quality than the real dataset, after applying the same cleaning procedure its metrics become comparable to those of the cleaned original dataset (Fig. 2a, 2b, Table 2). By combining these two datasets, we obtained the training data for the final model.

4. Experiments

4.1. Motivation for choosing the model

There are various approaches to the problem of text detoxification. One possible method is to employ encoder-only models, such as BERT, to identify toxic words in a sentence, mask them, and then treat the problem as a denoising task. However, given that we have a dataset consisting of parallel data (i.e., toxic and neutral versions of the same sentence), it is more intuitive to view this problem as a sequence-to-sequence task. Therefore, selecting a full transformer model is the obvious choice for this problem.

Table 3
Evaluation metrics of models trained on different data types: Dirty Real (original competition data before cleaning), Dirty Synthetic (generated data before cleaning), Clean Real (competition data after cleaning), and Clean Synthetic (generated data after cleaning). Cleaning was done with the pipeline explained in Section 3.2. Eval data is a 10% random sample of the dirty real data. Best results are in bold.

Regime                STA   SIM   chrF_1  J
Dirty Real            0.64  0.89  0.70    0.41
Dirty Synth           0.69  0.88  0.65    0.41
Dirty Real + Synth    0.68  0.85  0.66    0.41
Dirty Synth + Real    0.68  0.90  0.69    0.43
Dirty Mixed           0.70  0.92  0.69    0.44
Cleaned Real          0.71  0.90  0.72    0.477
Cleaned Synth         0.71  0.90  0.66    0.437
Cleaned Real + Synth  0.72  0.87  0.71    0.44
Cleaned Synth + Real  0.73  0.88  0.68    0.454
Cleaned Mixed         0.74  0.89  0.73    0.481

There are three primary families of multilingual encoder-decoder transformer models: mT5, UMT5, and mT0. mT5 [1] is a T5-like model [6] trained on multilingual data. UMT5 [16], on the other hand, shares the same architecture as mT5 but utilizes a novel language sampling algorithm for better dataset creation; UMT5 models have been demonstrated to outperform mT5 models of the same size across a wide range of tasks. mT0 [1], meanwhile, is the result of fine-tuning mT5 models on an instruction set, similar to FLAN-T5 [17]. Our experiments show that fine-tuned mT0 models perform better in the task of text detoxification, which led us to adopt the mT0 family as the foundation of our detoxification pipeline. Specifically, we opted for the bigscience/mt0-xl6 [1] model, as it was the largest model that could fit on our GPU without relying on techniques like LoRA [18].

In addition to mT0-xl, we explored the use of the mT5-xl7 and aya-1018 [19] models. However, mT5-xl underperformed due to the lack of instruction tuning, while the aya-101 model was too large to be trained on our GPU. We also attempted to utilize LoRA for this task, but even with a high rank hyperparameter, the resulting model's performance remained inferior to that of the selected mT0-xl model.
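Although LoRA did not match full finetuning in our experiments, a sketch of such a configuration is shown below for completeness. The rank, alpha, dropout and target modules are illustrative assumptions, not the exact hyperparameters we tried.

```python
# Sketch of a LoRA setup for mt0-xl (Section 4.1); hyperparameters are
# illustrative assumptions, not the exact values from our experiments.
from peft import LoraConfig, TaskType, get_peft_model
from transformers import AutoModelForSeq2SeqLM

base = AutoModelForSeq2SeqLM.from_pretrained("bigscience/mt0-xl")
lora_cfg = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    r=64,                       # even high ranks underperformed full finetuning
    lora_alpha=128,
    lora_dropout=0.05,
    target_modules=["q", "v"],  # T5-style attention projection names
)
model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()  # only the adapter weights are trainable
```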
4.2. Exploring different synthetic data training regimes

During training, we explored ten different approaches to training models on synthetic data. We examined training models on real and synthetic data before and after cleaning, mixing the synthetic and real data before and after cleaning, and sequentially training on real + synthetic and synthetic + real data in a two-stage fashion, both before and after cleaning. The models were trained with the following parameters: AdamW optimizer [20], inverse square root scheduler, learning rate (lr) = 8e-5, batch size (bs) = 4. Training was done in full precision, for one epoch.

The best-performing model, according to the evaluation set metrics (Table 3), is the model trained on a mix of synthetic and real data. We attribute this to the fact that adding synthetic data to the mix increases the STA metric, which is the hardest metric to optimize: given enough training steps, the model learns more toxic words and becomes better at deleting them from the input data. Additionally, it is interesting to note that training on synthetic data boosts the STA metric and lowers the chrF_1 metric.

6 bigscience/mt0-xl on HuggingFace https://huggingface.co/bigscience/mt0-xl
7 google/mt5-xl on HuggingFace https://huggingface.co/google/mt5-xl
8 CohereForAI/aya-101 on HuggingFace https://huggingface.co/CohereForAI/aya-101

Table 4
First 5 results after automatic evaluation. The leaderboard is based on the J metric. The top-3 results are highlighted in bold. The top-1 result is both bold and underlined.

User            average  en     es     de     zh     ar     hi     uk     ru     am
adugeen         0.523    0.602  0.562  0.678  0.178  0.626  0.355  0.692  0.634  0.378
lmeribal        0.515    0.593  0.555  0.669  0.165  0.617  0.352  0.686  0.628  0.374
nikita.sushko   0.465    0.553  0.480  0.592  0.176  0.575  0.241  0.668  0.570  0.328
VitalyProtasov  0.445    0.531  0.472  0.502  0.175  0.523  0.320  0.629  0.542  0.311
erehulka        0.435    0.543  0.497  0.575  0.160  0.536  0.185  0.602  0.529  0.287

Two-stage training yields middling results in both chrF_1 and STA, providing better scores than the worst models. The mixed training regime comes out on top, boasting both higher STA and chrF_1 than all other training regimes, although with slightly reduced SIM scores. Cleaning the data significantly boosts both the chrF_1 and STA metrics and moderately improves the SIM metric. The model trained on the cleaned version of the real data outperforms all models trained on non-cleaned data, even when the synthetic data is mixed in.

Thus, the optimal approach for training detoxification models in this setting is the Cleaned Mixed training regime: clean both the synthetic and real datasets of pairs where the neutral output is still toxic or where the toxic and neutral sentences are dissimilar, and then mix them together into one large training set on which the model is trained.

4.3. Final model training

The bigscience/mt0-xl9 model, trained on a mix of synthetic and real data, was used for the final submission. The training parameters were as follows: AdamW optimizer, inverse square root scheduler, a learning rate of 8e-5, and a batch size of 4. The model was trained in full precision for two epochs. To ensure the model generated responses in the correct language, we used the following prompt: "Write a non-toxic version of the following text in 'language': 'toxic sentence'." Without this prompt, the model tended to respond in a language different from the input.
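A minimal sketch of this final finetuning setup, using the HuggingFace Seq2SeqTrainer, is shown below. The dataset column names ("lang", "toxic", "neutral") and the maximum sequence length are assumptions about the data schema; our actual training script may differ in such details.

```python
# Sketch of the final finetuning run (Section 4.3); column names are assumed.
from datasets import load_dataset
from transformers import (AutoModelForSeq2SeqLM, AutoTokenizer,
                          DataCollatorForSeq2Seq, Seq2SeqTrainer,
                          Seq2SeqTrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("bigscience/mt0-xl")
model = AutoModelForSeq2SeqLM.from_pretrained("bigscience/mt0-xl")
dataset = load_dataset("chameleon-lizard/synthetic-multilingual-paradetox",
                       split="train")  # assumed schema: lang / toxic / neutral

def preprocess(batch):
    # Prompt format from Section 4.3 keeps generation in the input language.
    prompts = [f"Write a non-toxic version of the following text in {l}: {t}"
               for l, t in zip(batch["lang"], batch["toxic"])]
    enc = tokenizer(prompts, truncation=True, max_length=512)
    enc["labels"] = tokenizer(text_target=batch["neutral"], truncation=True,
                              max_length=512)["input_ids"]
    return enc

args = Seq2SeqTrainingArguments(
    output_dir="detox-mt0-xl",
    per_device_train_batch_size=4,     # batch size from Section 4.3
    learning_rate=8e-5,                # lr from Section 4.3
    lr_scheduler_type="inverse_sqrt",  # inverse square root scheduler
    num_train_epochs=2,                # two epochs, full precision (no fp16)
)
trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=dataset.map(preprocess, batched=True,
                              remove_columns=dataset.column_names),
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```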
The final submission was based on a combination of answers from different models, taken from different training checkpoints. Notably, the models sometimes failed to detoxify sentences, overlooking words that could be removed simply by cutting them out. To address this, each output in the submission pipeline was additionally detoxified with the "delete" baseline method.

5. Results

Our final model achieved third place in the automatic evaluation and fourth place in the manual human evaluation. During the automatic evaluation, our model consistently ranked within the top three (Table 4), only being outperformed by other models in Spanish and Hindi. The model visibly struggled on the Chinese and Hindi datasets, where it performed much worse than in other languages. The reason is that the provided data after cleaning was insufficient for training a quality detoxification model, and we had to rely on the delete baseline for detoxification in Chinese. We tried to mitigate this by adding synthetic data, but even after cleaning out the non-detoxified samples, the amount of data remained insufficient for training a good detoxification model for these languages.

In the human evaluation, our model secured first place in Arabic detoxification and ranked among the top three models in Arabic, German, and Hindi (Table 5). Notably, our model outperformed the human references in Arabic and German on the human evaluation subset. Some examples of detoxification are shown in Table 6.

9 bigscience/mt0-xl on HuggingFace https://huggingface.co/bigscience/mt0-xl

Table 5
First 5 results after manual evaluation. The leaderboard is based on the J metric. The top-3 results are highlighted in bold. The top-1 result is both bold and underlined.

User              average  en    es    de    zh    ar    hi    uk    ru    am
Human References  0.85     0.88  0.79  0.71  0.93  0.82  0.97  0.90  0.80  0.85
SomethingAwful    0.77     0.86  0.83  0.89  0.53  0.74  0.86  0.69  0.84  0.71
adugeen           0.74     0.83  0.73  0.70  0.60  0.82  0.68  0.84  0.76  0.71
VitalyProtasov    0.72     0.69  0.81  0.77  0.49  0.79  0.87  0.67  0.73  0.68
nikita.sushko     0.71     0.70  0.62  0.79  0.47  0.89  0.84  0.67  0.74  0.68
erehulka          0.71     0.88  0.71  0.85  0.68  0.78  0.52  0.63  0.65  0.69

Table 6
Examples of detoxification

Language   Toxic input                                                      Neutral output
English    almost as fucked up as the cia funding and arming bin laden.     Almost as bad as the cia funding and arming bin laden.
German     Die wollen von nix ablenken. Die sind naturgemäß so dumm.        Die wollen von nix ablenken. Die sind naturgemäß so naiv.
Spanish    Que mierda es esa ubicación millenial ahora estoy re enojado loko   Que mala es esa ubicación ahora estoy re enojado
Ukrainian  Блять, уйобую нахуй в Острог i не вертаюсь. впiзду               Уже йду в Острог i не вертаюсь.
Russian    дело даже не в iq - просто хуесосам там не место                 дело даже не в iq - просто плохим людям там не место

6. Conclusions

In conclusion, we have identified the optimal approach to training a multilingual seq2seq model for the text detoxification task. When combined with the toxic word deletion baseline, our resulting model achieved third place in the automatic evaluation stage of the PAN 2024 TextDetox competition. The model and dataset are available for download on HuggingFace.
References
[1] N. Muennighoff, T. Wang, L. Sutawika, A. Roberts, S. Biderman, T. L. Scao, M. S. Bari, S. Shen, Z.-X. Yong, H. Schoelkopf, X. Tang, D. Radev, A. F. Aji, K. Almubarak, S. Albanie, Z. Alyafeai, A. Webson, E. Raff, C. Raffel, Crosslingual generalization through multitask finetuning, 2023. arXiv:2211.01786.
[2] D. Dementieva, D. Moskovskiy, N. Babakov, A. A. Ayele, N. Rizwan, F. Schneider, X. Wang, S. M. Yimam, D. Ustalov, E. Stakovskii, A. Smirnova, A. Elnagar, A. Mukherjee, A. Panchenko, Overview of the multilingual text detoxification task at PAN 2024, in: G. Faggioli, N. Ferro, P. Galuščáková, A. G. S. de Herrera (Eds.), Working Notes of CLEF 2024 - Conference and Labs of the Evaluation Forum, CEUR-WS.org, 2024.
[3] J. Bevendorff, X. B. Casals, B. Chulvi, D. Dementieva, A. Elnagar, D. Freitag, M. Fröbe, D. Korenčić, M. Mayerl, A. Mukherjee, A. Panchenko, M. Potthast, F. Rangel, P. Rosso, A. Smirnova, E. Stamatatos, B. Stein, M. Taulé, D. Ustalov, M. Wiegmann, E. Zangerle, Overview of PAN 2024: Multi-Author Writing Style Analysis, Multilingual Text Detoxification, Oppositional Thinking Analysis, and Generative AI Authorship Verification, in: Experimental IR Meets Multilinguality, Multimodality, and Interaction. Proceedings of the Fourteenth International Conference of the CLEF Association (CLEF 2024), Lecture Notes in Computer Science, Springer, Berlin Heidelberg New York, 2024.
[4] C. N. dos Santos, I. Melnyk, I. Padhi, Fighting offensive language on social media with unsupervised text style transfer, 2018. arXiv:1805.07685.
[5] L. Laugier, J. Pavlopoulos, J. Sorensen, L. Dixon, Civil rephrases of toxic texts with self-supervised transformers, in: P. Merlo, J. Tiedemann, R. Tsarfaty (Eds.), Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, Association for Computational Linguistics, Online, 2021, pp. 1442–1461. URL: https://aclanthology.org/2021.eacl-main.124. doi:10.18653/v1/2021.eacl-main.124.
[6] C. Raffel, N. Shazeer, A. Roberts, K. Lee, S. Narang, M. Matena, Y. Zhou, W. Li, P. J. Liu, Exploring the limits of transfer learning with a unified text-to-text transformer, 2023. arXiv:1910.10683.
[7] D. Dementieva, V. Logacheva, I. Nikishina, A. Fenogenova, D. Dale, I. Krotova, N. Semenov, T. Shavrina, A. Panchenko, RUSSE-2022: Findings of the first Russian detoxification shared task based on parallel corpora, 2022, pp. 114–131. doi:10.28995/2075-7182-2022-21-114-131.
[8] I. Gusev, Russian texts detoxification with Levenshtein editing, 2022. arXiv:2204.13638.
[9] D. Dale, A. Voronov, D. Dementieva, V. Logacheva, O. Kozlova, N. Semenov, A. Panchenko, Text detoxification using large pre-trained neural models, 2021. arXiv:2109.08914.
[10] X. Wu, S. Lv, L. Zang, J. Han, S. Hu, Conditional BERT contextual augmentation, in: J. M. F. Rodrigues, P. J. S. Cardoso, J. Monteiro, R. Lam, V. V. Krzhizhanovskaya, M. H. Lees, J. J. Dongarra, P. M. Sloot (Eds.), Computational Science – ICCS 2019, Springer International Publishing, Cham, 2019, pp. 84–95.
[11] D. Moskovskiy, D. Dementieva, A. Panchenko, Exploring cross-lingual text detoxification with large multilingual language models, in: S. Louvan, A. Madotto, B. Madureira (Eds.), Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop, Association for Computational Linguistics, Dublin, Ireland, 2022, pp. 346–354. URL: https://aclanthology.org/2022.acl-srw.26. doi:10.18653/v1/2022.acl-srw.26.
[12] textdetox, xlmr-large-toxicity-classifier model on HuggingFace, https://huggingface.co/textdetox/xlmr-large-toxicity-classifier, 2024. Accessed: 2024-05-15.
[13] F. Feng, Y. Yang, D. Cer, N. Arivazhagan, W. Wang, Language-agnostic BERT sentence embedding, 2022. arXiv:2007.01852.
[14] M. Popović, chrF: character n-gram F-score for automatic MT evaluation, in: O. Bojar, R. Chatterjee, C. Federmann, B. Haddow, C. Hokamp, M. Huck, V. Logacheva, P. Pecina (Eds.), Proceedings of the Tenth Workshop on Statistical Machine Translation, Association for Computational Linguistics, Lisbon, Portugal, 2015, pp. 392–395. URL: https://aclanthology.org/W15-3049. doi:10.18653/v1/W15-3049.
[15] L. Wang, N. Yang, X. Huang, L. Yang, R. Majumder, F. Wei, Multilingual E5 text embeddings: A technical report, arXiv preprint arXiv:2402.05672 (2024).
[16] H. W. Chung, X. Garcia, A. Roberts, Y. Tay, O. Firat, S. Narang, N. Constant, UniMax: Fairer and more effective language sampling for large-scale multilingual pretraining, in: The Eleventh International Conference on Learning Representations, 2023. URL: https://openreview.net/forum?id=kXwdL1cWOAi.
[17] H. W. Chung, L. Hou, S. Longpre, B. Zoph, Y. Tay, W. Fedus, Y. Li, X. Wang, M. Dehghani, S. Brahma, A. Webson, S. S. Gu, Z. Dai, M. Suzgun, X. Chen, A. Chowdhery, A. Castro-Ros, M. Pellat, K. Robinson, D. Valter, S. Narang, G. Mishra, A. Yu, V. Zhao, Y. Huang, A. Dai, H. Yu, S. Petrov, E. H. Chi, J. Dean, J. Devlin, A. Roberts, D. Zhou, Q. V. Le, J. Wei, Scaling instruction-finetuned language models, 2022. arXiv:2210.11416.
[18] E. J. Hu, Y. Shen, P. Wallis, Z. Allen-Zhu, Y. Li, S. Wang, L. Wang, W. Chen, LoRA: Low-rank adaptation of large language models, 2021. arXiv:2106.09685.
[19] A. Üstün, V. Aryabumi, Z.-X. Yong, W.-Y. Ko, D. D'souza, G. Onilude, N. Bhandari, S. Singh, H.-L. Ooi, A. Kayid, F. Vargus, P. Blunsom, S. Longpre, N. Muennighoff, M. Fadaee, J. Kreutzer, S. Hooker, Aya model: An instruction finetuned open-access multilingual language model, 2024. URL: https://arxiv.org/abs/2402.07827. arXiv:2402.07827.
[20] I. Loshchilov, F. Hutter, Decoupled weight decay regularization, 2019. arXiv:1711.05101.