=Paper=
{{Paper
|id=Vol-3740/paper-296
|storemode=property
|title=RAG Meets Detox: Enhancing Text Detoxification Using Open Large Language Models with
Retrieval Augmented Generation
|pdfUrl=https://ceur-ws.org/Vol-3740/paper-296.pdf
|volume=Vol-3740
|authors=Erik Řehulka,Marek Šuppa
|dblpUrl=https://dblp.org/rec/conf/clef/RehulkaS24
}}
==RAG Meets Detox: Enhancing Text Detoxification Using Open Large Language Models with Retrieval Augmented Generation==
RAG Meets Detox: Enhancing Text Detoxification Using Open Large Language Models with Retrieval Augmented Generation
Notebook for PAN at CLEF 2024
Erik Řehulka1, Marek Šuppa1,2
1 Comenius University, Bratislava, Slovakia
2 Cisco Systems

Abstract
In this work we present our solution to the Multilingual Text Detoxification 2024 task, whose objective is to take a toxic text and convert it into one that conveys the same meaning without containing any toxicity. Our approach utilizes open Large Language Models extended with dynamic prompt creation combined with Retrieval Augmented Generation. The evaluation results show that despite its simplicity, our method has the potential to provide competitive results, as evidenced by both the automatic and the manual evaluation executed by the task organizers. Overall, our approach ranked 5th in the manual evaluation, with our best-performing language, German, even surpassing the human reference.

Keywords
PAN 2024, Retrieval Augmented Generation, text detoxification, Llama3, LLM, toxic text

1. Introduction
The identification of toxicity in texts is an active area of research. Social networks often try to address this problem by simply blocking such texts. A more interesting and effective approach is to automatically rewrite these texts so that they are ideally no longer toxic, while their meaning is kept intact. This process is denoted as detoxification. The Multilingual Text Detoxification (TextDetox) 2024 task aims to create and explore such methods. The participants are provided with a dataset of toxic texts in several languages from all over the globe, which should then be detoxified. The goal is to find a method which, after evaluation, produces texts that are neutral but whose meaning is the same as that of the toxic input. We explore how a data scientist with only API access to a Large Language Model (LLM), in our case Llama3, can develop effective solutions for this task. We did not fine-tune or alter the model in any way; our only approach was to creatively adjust the prompts given to the LLM so that its outputs would score as highly as possible. We developed several methods, from simple ones like zero-shot prompting to utilizing existing datasets of text detoxifications and generating the prompts dynamically based on the input text to be detoxified. For this we used external tools such as vector databases containing pairs of toxic texts and their neutral counterparts, which were queried using embeddings of the toxic texts. We found this method to be competitive, and despite its simplicity it achieved a high ranking in this task. We submitted our results under the usernames erehulka and mareksuppa to the CodaLab portal. Our best-performing languages, in comparison to other participants' submissions, were German and Chinese. Notably, our score for German even surpassed the human references in the manual evaluation. Overall, our approach ranked 5th in the manual evaluation. A more detailed discussion of our results is provided in Section 6.

CLEF 2024: Conference and Labs of the Evaluation Forum, September 09–12, 2024, Grenoble, France
rehulka3@uniba.sk (E. Řehulka); marek@suppa.sk (M. Šuppa)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

2. Related Work
Most of the work in this area has been done for the English and Russian languages.
The first study on automatic detoxification of the Russian language was introduced in 2021 [1], where two models were suggested: a BERT-based approach for local corrections and a supervised approach based on a pretrained GPT-2 model. Another contribution presented two unsupervised methods for text detoxification [2]. The first method combined style-conditional language models with paraphrasing models to perform style transfer, while the second method used BERT to replace toxic words with non-offensive synonyms. In the realm of parallel data collection for the detoxification task, the ParaDetox pipeline was introduced [3]. It collected non-toxic paraphrases of toxic sentences. This work provided new parallel corpora for training detoxification models, demonstrating that models trained on this data outperformed existing unsupervised approaches by a significant margin. The challenges of multilingual and cross-lingual detoxification were explored by investigating the behavior of large multilingual models [4]. The authors found that these models can perform multilingual style transfer, but that cross-lingual detoxification is not possible without fine-tuning. The RUSSE-2022 shared task [5] was a competition similar to this one, focused on detoxification methods for Russian, featuring parallel training data and manual evaluation. The study found that the best performance for the Russian language was achieved using the ruT5 model fine-tuned on parallel data. In this context, parallel data refers to each toxic record having a corresponding neutral counterpart. A study has also been conducted on evaluating the quality of the detoxification process, suggesting that the ChrF and BertScore metrics can be used for this task [6]. The ChrF measure is also used in this task to compute the similarity of meaning between the original toxic text and its detoxified version.

3. Datasets
In our work we utilized two datasets, both published by the competition organizers and available on the HuggingFace platform. The first dataset is textdetox/multilingual_toxic_lexicon [7], which contains a set of toxic words for each of the input languages. As Table 1 shows, the number of toxic words differs considerably between the languages: for languages like Amharic, Hindi or even German there are only a couple of hundred records, while for Russian there are around 141,000 records. These words were used as a lexicon of toxic words, which were passed to some of the prompts as examples of words that must not be present in the resulting detoxified text.

Table 1
Number of Records per Language

Language    Abbreviation    Number of Records
Amharic     am              245
Spanish     es              1,200
Russian     ru              141,000
Ukrainian   uk              7,360
English     en              3,390
Chinese     zh              3,840
Arabic      ar              430
Hindi       hi              133
German      de              247

The second dataset is textdetox/multilingual_paradetox [8], which was published by the organizers after the development phase had ended and contains both the original and the detoxified inputs for the development phase. We used this dataset to dynamically create the prompt using Retrieval Augmented Generation (RAG), discussed further in Section 4.4. For each language the dataset contained exactly 400 records.
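As an illustration, the following minimal Python sketch shows how the two organizer-provided datasets could be loaded from the HuggingFace Hub. The dataset names are the ones cited above; the split and column layout printed at the end is not described in this paper and should be checked against the dataset cards.

from datasets import load_dataset

# Lexicon of toxic words per language (record counts per language are in Table 1).
toxic_lexicon = load_dataset("textdetox/multilingual_toxic_lexicon")

# Parallel toxic/neutral pairs released after the development phase
# (400 records per language), later used to build the RAG index.
paradetox = load_dataset("textdetox/multilingual_paradetox")

# Inspect the available splits and columns before building prompts or indexes.
print(toxic_lexicon)
print(paradetox)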
4. System Description
On input, our system receives a toxic text to be detoxified, along with an abbreviation of the language of the input. All of the input languages with their respective abbreviations can be seen in Table 1. Our approach was to create a prompt which would then be passed to the Llama3 model, such that its result would be the detoxified text. Here we present the different methods we used, starting from simple zero-shot prompting and ending with retrieval augmented generation combined with a simple lexicon.

4.1. Zero Shot
In the first phase we explored how effective the zero-shot approach is, with just the instructions on how to detoxify the input and the input itself at the end. The prompt can be seen in Appendix A and only specifies the task of detoxification. Regarding the different languages, the prompt did not state what the input language was, only that the result must be in the same language as the input. With just this knowledge, the model outputs were sometimes very different from the input, because the model tried to rewrite the whole sentence to keep the meaning and avoid toxicity. However, the notion of toxic text is very broad, so we needed to specify how the output should look for a given input, so that the model would rewrite only the toxic parts and ideally keep the rest of the text as it is. Thus we moved to providing examples of input-output pairs to the model.

4.2. Few Shot with language specification and examples
Next we added some examples, mainly in English but also in other languages, to the prompt, along with a specification of the input language. The prompt can be seen in Appendix B, with {language} filled in from Table 1. This proved to yield better results in languages other than English. The English examples used during the development phase were retrieved from the s-nlp/paradetox dataset on the HuggingFace platform, which was published before the competition and mentioned in the competition guidelines. Because such a dataset existed only for English and Russian (s-nlp/ru_paradetox), the examples for the other languages were taken from the output of our own model.

4.3. Separate prompts for each language
We also experimented with creating a separate prompt for each language, with examples only from that language. We even tried to translate the whole prompt from English into the given language. However, this did not prove to be a great improvement; for some languages such as Hindi or Amharic it was even worse than the previous approaches, so we dropped this idea, and for most languages we created the prompt in English, with only the input and output examples being in the target language. Since these methods utilized no external knowledge unknown to the model, we needed a way to fill in the examples from real data containing pairs of a toxic text and its detoxified version for each language. Such a dataset was released after the development phase, containing 400 examples for each language. With this we moved to utilizing Retrieval-Augmented Generation and generating the examples such that they are most similar to the text we are actually trying to detoxify.
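To make the prompting setup concrete, the sketch below shows one way the zero-shot template from Appendix A could be filled and sent to a Llama3 model exposed through an HTTP API. The endpoint URL, the model identifier, the OpenAI-compatible request/response shape, and the decoding settings are illustrative assumptions; the paper only states that the model was accessed via an API.

import requests

# Hypothetical OpenAI-compatible endpoint serving Llama3; not taken from the paper.
API_URL = "http://localhost:8000/v1/chat/completions"

ZERO_SHOT_TEMPLATE = (
    "You are a text de-toxifier. You receive a toxic text and your task is to "
    "rewrite the toxic parts in a non-toxic way while saving the main content. "
    "Do not add more text, if the text is short, keep it short. Keep the non-toxic "
    "words in the text the same. You do not respond anything else, just the "
    "rewritten, non-toxic text. The input is in \"\" and can be in any language. "
    "The result must be in the same language as the input.\n\n\"{phrase}\""
)

def detoxify_zero_shot(phrase: str) -> str:
    """Fill the zero-shot template and ask the served model for a detoxified rewrite."""
    prompt = ZERO_SHOT_TEMPLATE.format(phrase=phrase)
    response = requests.post(
        API_URL,
        json={
            "model": "llama-3-8b-instruct",   # assumed model identifier
            "messages": [{"role": "user", "content": prompt}],
            "temperature": 0.0,               # keep the rewriting deterministic
        },
        timeout=60,
    )
    response.raise_for_status()
    return response.json()["choices"][0]["message"]["content"].strip()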
4.4. Retrieval Augmented Generation (RAG)
RAG, or Retrieval-Augmented Generation, enhances the capabilities of large language models (LLMs) by incorporating references from knowledge bases external to their training data. These LLMs are trained on extensive datasets and utilize billions of parameters to perform tasks such as answering queries, language translation, and text completion. The main difference from model fine-tuning is that fine-tuning involves adjusting the parameters of a pretrained language model to adapt it to a specific task or domain. This process typically involves training the model on a smaller, task-specific dataset to refine its understanding and performance in that particular area. Fine-tuning essentially tweaks the existing parameters of the model to specialize its capabilities. RAG, on the other hand, incorporates references from external knowledge bases to enhance the output of a language model without retraining it. Instead of modifying the model's parameters, RAG augments the generation process by retrieving relevant information from authoritative sources and adding this retrieved information to the prompt. This approach is not only cost-effective but also helps the LLM-generated content maintain relevance, accuracy, and utility across diverse contexts. For this reason, we chose to explore the performance of RAG on this specific task, so that we would not have to fine-tune existing models but could use them in their current state.

4.4.1. Prompt Template
The complete prompt is available in Appendix C. It consists of several elements, most of which are dynamically generated according to the specific input. However, the prefix, which directs the LLM to act as a text detoxifier, remains consistent across all prompts:

## Task
You are a text detoxifier. On input you receive a text which may be toxic or harmful. Your task is to rewrite this text in a way that does not contain any toxicity or harmful words, while preserving the original content and context. The Output contains only the detoxified text and nothing else like notes or additional information. You do not add any more context to the resulting text, which is not in the original text. Do not rewrite the original text too much, just either remove the toxic part completely, or replace it with some non-toxic words while preserving the meaning and context. The language of the input is {language} and the language of the response must be the same.

The language (in the prompt as {language}) is filled in based on the language of the input, as mentioned before. Most of the remaining prompt is generated based on the specific input to be detoxified.

4.4.2. Retrieval augmentation
In the used prompt, there is also space for examples, mainly in the following part:

## Toxic words
The input text may contain offensive or harmful words. You should either remove them or replace them with non-offensive words. Here are some examples of toxic words you may find in the input text:
{toxic_words}
These CAN NOT be used in the output text. You must replace them with non-toxic words. If that is not possible, remove them completely.

## Examples
{examples}

As the listing shows, the prompt contains parts denoted as ## Toxic words and ## Examples, which contain a number of examples that are filled in based on the input text. For this we used the method of Retrieval Augmentation, where we obtain a number of examples relevant to the input text for both parts. For the ## Toxic words part we use the multilingual_toxic_lexicon dataset, from which we take all of the words from the input text that are present in the lexicon for the input language, along with five random words from this dataset, to make sure that there are always at least five examples of toxic words. For the selection of words present in the input we simply filtered those lexicon words that appear verbatim in the input text, without building an index or any other structure. Each of these words is added to the prompt in the format - {word}\n.
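A minimal sketch of this toxic-word selection step is shown below, assuming the lexicon for the input language has already been loaded into a Python set. The function names, the whitespace tokenization, and the handling of duplicates are our own simplifications, not the authors' code.

import random

def select_toxic_words(input_text: str, lexicon: set[str], n_random: int = 5) -> list[str]:
    """Pick lexicon words that literally occur in the input, plus a few random ones.

    `lexicon` is assumed to hold the toxic words for the input language from
    textdetox/multilingual_toxic_lexicon.
    """
    tokens = input_text.lower().split()
    matched = [w for w in lexicon if w.lower() in tokens]
    # Always add a handful of random lexicon words so the prompt contains
    # at least five examples of toxic words.
    extra = random.sample(sorted(lexicon), k=min(n_random, len(lexicon)))
    return matched + extra

def format_toxic_words(words: list[str]) -> str:
    """Render the selected words in the '- {word}\\n' format used in the prompt."""
    return "".join(f"- {w}\n" for w in words)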
For the second part, ## Examples, we use RAG. We utilize the Chroma vector database (https://www.trychroma.com/) to create an index of embeddings from the aforementioned textdetox/multilingual_paradetox dataset. We generate the index using the LaBSE transformer model, since this model was used by the organizers when evaluating the submissions. Each record from the dataset is saved in the collection with the following fields:
• embeddings - the embedded toxic sentence,
• documents - the unmodified toxic sentence,
• metadatas - the neutral (detoxified) sentence for that row, along with the language of the input,
• ids - the id required by the database.
Instead of creating a separate index for each of the languages, we created only one and then retrieve data from it by specifying which language we want. At inference time, the same embedding model with which the index was created provides the embedding of the sample being evaluated, and this embedding is used to query the database, which returns the k closest items from its index for the same language as the sample. We set the value of k to 10. After getting these closest items, we add them to the prompt in the format Input: {text}\nOutput: {metadata['neutral_sentence']}\n\n, with text being the original toxic sentence and neutral_sentence its detoxified version.
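The following sketch outlines how such an index could be built and queried with Chroma and a sentence-transformers LaBSE model. The collection fields and metadata keys follow the description above; the column names of the paradetox records ('toxic_sentence', 'neutral_sentence') and the in-memory client are assumptions for illustration.

import chromadb
from sentence_transformers import SentenceTransformer

# LaBSE, the same embedding model the organizers use for evaluation.
embedder = SentenceTransformer("sentence-transformers/LaBSE")

client = chromadb.Client()  # in-memory instance; a persistent client works the same way
collection = client.create_collection(name="multilingual_paradetox")

def index_pairs(pairs: list[dict], language: str) -> None:
    """Add the (toxic, neutral) pairs of one language to the single shared collection."""
    toxic = [p["toxic_sentence"] for p in pairs]
    collection.add(
        ids=[f"{language}-{i}" for i in range(len(pairs))],
        documents=toxic,
        embeddings=embedder.encode(toxic).tolist(),
        metadatas=[
            {"neutral_sentence": p["neutral_sentence"], "language": language}
            for p in pairs
        ],
    )

def retrieve_examples(input_text: str, language: str, k: int = 10) -> str:
    """Fetch the k closest toxic/neutral pairs in the input language and format them."""
    result = collection.query(
        query_embeddings=embedder.encode([input_text]).tolist(),
        n_results=k,
        where={"language": language},  # restrict retrieval to the input language
    )
    lines = []
    for doc, meta in zip(result["documents"][0], result["metadatas"][0]):
        lines.append(f"Input: {doc}\nOutput: {meta['neutral_sentence']}\n\n")
    return "".join(lines)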
4.4.3. Delete baseline
For the Amharic language, this approach did not prove to be ideal. The responses of the Llama3 model for these prompts took much longer than for the other languages, and the resulting score was much lower compared to other participants' submissions. Because of this, for this specific language we used an approach similar to the delete baseline, that is, simply deleting toxic words from the input and keeping the rest as is. To do so, we removed all words from the input that were present in the toxic word lexicon dataset for Amharic and returned the modified text as the detoxified output. Nothing else was done for this particular language, so it was not passed through the same pipeline described in this section.

5. Output cleanup
Since the model does not always output only the detoxified text, we needed to clean up the result so that only the detoxified text remained. The model sometimes included a "Note" commenting on the detoxified text, or it would prefix the result with "Translation:" or "Output". We removed these texts completely, and for the notes we also removed all text that followed the "Note" keyword. Because the note was always at the end of the model output, no information was lost this way. This behavior was present mainly in the first approaches such as zero-shot; for the final RAG-based approach we did not need this step. The model also sometimes responded with the text in quotes, in which case we removed the quotes from the beginning and end of the response. For several languages the quote characters differed, so we remove the following types of quotes: ", “, «, ’.

6. Results and Discussion
In this section we describe the main results obtained as part of the Task, in which the outputs of the submitted systems were evaluated both automatically and by manually labeling a subsample of 100 texts per language via crowdsourcing. To evaluate the results, three criteria are assessed:
1. The absence of offensive content or toxic language in the text. For this purpose, an xlm-roberta-large model specifically fine-tuned for binary toxicity classification is used [9].
2. The preservation of the original meaning in the detoxified text. This is quantified by calculating the cosine similarity between LaBSE embeddings.
3. The grammatical correctness of the detoxified text. This is evaluated using the ChrF metric [10].
Each of these metrics ranges from 0 to 1. To derive a composite metric, the final score for a given result is computed as the product of all three metrics, referred to as the joint metric. Subsequently, the scores for individual languages are calculated as the mean of all scores for that language, and the overall score is obtained by averaging the scores across all languages.
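The scoring scheme can be summarized in a small sketch; the three per-sample component scores are assumed to be precomputed by the organizers' evaluation tools, and the function names below are ours.

from statistics import mean

def joint_score(non_toxicity: float, meaning_similarity: float, fluency_chrf: float) -> float:
    """Per-sample joint metric: product of the three component scores, each in [0, 1]."""
    return non_toxicity * meaning_similarity * fluency_chrf

def language_score(sample_scores: list[float]) -> float:
    """Score for one language: mean of the joint scores of its samples."""
    return mean(sample_scores)

def overall_score(per_language_scores: dict[str, float]) -> float:
    """Overall score: average of the per-language scores."""
    return mean(per_language_scores.values())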
The results of the automatic evaluation can be seen in Table 2. In it, we outline all of the system configurations in order to show their impact on the final score. The very first line describes the results obtained by the "duplicate" baseline, which was included to highlight and contrast the baseline performance with the presented models. In general, we can conclude that all of the proposed models managed to beat the baseline when considering the average performance across all languages, in some cases by a significant margin. The zero-shot approach, denoted as "Llama 3" in Table 2, obtained an average performance of 0.380. Extending it with Retrieval Augmented Generation (RAG) improved the average performance to 0.403. We then noticed that when retrieving the examples to include in the prompt, which is an important part of the RAG pipeline, the resulting examples might be in a language different from the output language. This has the potential to confuse the model during the generation process, and hence we limited the examples to only include those that are in the output language ("+ select"). We further hypothesized that instructing the model to ensure specific toxic words were not present in the final output might have a positive impact on its quality. We hence implemented an approach in which all the toxic words obtained from the textdetox/multilingual_toxic_lexicon dataset were included in the prompt if they could be found in the toxic sentence to be detoxified. We further added 5 more toxic words to the prompt to allow the model to see more samples of toxic words to remove from the input. This yielded a performance of 0.418. We further experimented with the order of examples in the prompt and found that reversing their order (i.e. making sure the closest example from the validation set is mentioned last in the prompt) had a positive impact on performance (denoted as "+ reverse" in Table 2). In order to see what impact the embedding part of RAG has on the final performance, we also experimented with the embedding model and made use of the multilingual-e5-large model, which obtained state-of-the-art performance on many retrieval tasks in the multilingual setup [11].

Table 2
Performance metrics across different languages for various models and their components, evaluated as part of the automatic evaluation. The best performance per language is boldfaced.

Model         Average  en     es     de     zh     ar     hi     uk     ru     am
baseline      0.126    0.061  0.090  0.287  0.069  0.294  0.035  0.032  0.048  0.217
Llama 3       0.380    0.525  0.448  0.530  0.161  0.488  0.185  0.507  0.461  0.112
+ RAG         0.403    0.527  0.483  0.576  0.152  0.483  0.176  0.534  0.504  0.193
+ select      0.409    0.532  0.488  0.577  0.152  0.519  0.188  0.561  0.517  0.146
+ lexicon     0.418    0.543  0.497  0.575  0.160  0.536  0.185  0.602  0.529  0.135
+ reverse     0.420    0.527  0.499  0.563  0.169  0.538  0.193  0.602  0.523  0.167
+ multiling   0.424    0.537  0.492  0.577  0.156  0.547  0.181  0.615  0.540  0.173
+ am delete   0.437    0.537  0.492  0.577  0.156  0.547  0.181  0.615  0.540  0.287

As the results in Table 2 suggest, it also had a positive impact in our case, albeit a relatively small one. Finally, comparing the per-language performance, we noticed that the performance was steadily improving for all languages except Amharic (am). We hence decided to re-implement the delete baseline for this language, which was supposed to bring its performance to about 0.270. Our reimplementation brought the performance to 0.278, as depicted in the "+ am delete" line. As Table 3 shows, our final model did relatively well in many languages in the manual evaluation – in English its performance was tied with the human references, whereas for German it even surpassed them. We do note, however, that the model performed surprisingly poorly for Hindi (hi), suggesting that there is room for improvement in the future, for instance by updating the language-specific prompt in this particular case.

Table 3
Performance metrics across different languages for various models and their components, evaluated as part of the manual evaluation of a random subsample of 100 texts via crowdsourcing.

Model             Average  en    es    de    zh    ar    hi    uk    ru    am
human reference   0.85     0.88  0.79  0.71  0.93  0.82  0.97  0.90  0.80  0.85
ours              0.71     0.88  0.71  0.85  0.68  0.78  0.52  0.63  0.65  0.69

7. Conclusion
In this study, we assessed how well the Llama3 model performs when enhanced with retrieval augmented generation and a lexicon of toxic words for the purpose of text detoxification. By populating the prompt with pairs of toxic and neutral text examples, as well as toxic words from the lexicon, we achieve promising results. Our approach ranks 5th in the manual evaluation of the results, with the 4th-best approach having the same score and the best approach being better by only 0.06, indicating competitive performance. Particularly noteworthy is the strong performance on languages such as German and Chinese, for which our approach emerged as the second-best performer.

Acknowledgements
This research was partially supported by grant APVV-21-0114.

References
[1] D. Dementieva, D. Moskovskiy, V. Logacheva, D. Dale, O. Kozlova, N. Semenov, A. Panchenko, Methods for detoxification of texts for the russian language, Multimodal Technologies and Interaction 5 (2021). URL: https://www.mdpi.com/2414-4088/5/9/54. doi:10.3390/mti5090054.
[2] D. Dale, A. Voronov, D. Dementieva, V. Logacheva, O. Kozlova, N. Semenov, A. Panchenko, Text detoxification using large pre-trained neural models, in: M.-F. Moens, X. Huang, L. Specia, S. W.-t. Yih (Eds.), Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, Online and Punta Cana, Dominican Republic, 2021, pp. 7979–7996. URL: https://aclanthology.org/2021.emnlp-main.629. doi:10.18653/v1/2021.emnlp-main.629.
[3] V. Logacheva, D. Dementieva, S. Ustyantsev, D. Moskovskiy, D. Dale, I. Krotova, N. Semenov, A. Panchenko, ParaDetox: Detoxification with parallel data, 2022, pp. 6804–6818. doi:10.18653/v1/2022.acl-long.469.
[4] D. Moskovskiy, D. Dementieva, A. Panchenko, Exploring cross-lingual text detoxification with large multilingual language models, in: S. Louvan, A. Madotto, B. Madureira (Eds.), Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop, ACL 2022, Dublin, Ireland, May 22-27, 2022, Association for Computational Linguistics, 2022, pp. 346–354. URL: https://aclanthology.org/2022.acl-srw.26.
[5] D. Dementieva, V. Logacheva, I. Nikishina, A. Fenogenova, D. Dale, I. Krotova, N. Semenov, T. Shavrina, A. Panchenko, RUSSE-2022: Findings of the first Russian detoxification shared task based on parallel corpora, 2022, pp. 114–131. doi:10.28995/2075-7182-2022-21-114-131.
[6] V. Logacheva, D. Dementieva, I. Krotova, A. Fenogenova, I. Nikishina, T. Shavrina, A. Panchenko, A study on manual and automatic evaluation for text style transfer: The case of detoxification, Proceedings of the 2nd Workshop on Human Evaluation of NLP Systems (HumEval) (2022). URL: https://api.semanticscholar.org/CorpusID:248780050.
[7] TextDetox, Multilingual toxic lexicon, https://huggingface.co/datasets/textdetox/multilingual_toxic_lexicon, 2023. Accessed: 2024-05-30.
[8] TextDetox, Multilingual paradetox, https://huggingface.co/datasets/textdetox/multilingual_paradetox, 2023. Accessed: 2024-05-30.
[9] D. Dementieva, D. Moskovskiy, N. Babakov, A. A. Ayele, N. Rizwan, F. Schneider, X. Wang, S. M. Yimam, D. Ustalov, E. Stakovskii, A. Smirnova, A. Elnagar, A. Mukherjee, A. Panchenko, Overview of the multilingual text detoxification task at PAN 2024, in: G. Faggioli, N. Ferro, P. Galuščáková, A. G. S. de Herrera (Eds.), Working Notes of CLEF 2024 - Conference and Labs of the Evaluation Forum, CEUR-WS.org, 2024.
[10] M. Popović, chrF: character n-gram F-score for automatic MT evaluation, in: O. Bojar, R. Chatterjee, C. Federmann, B. Haddow, C. Hokamp, M. Huck, V. Logacheva, P. Pecina (Eds.), Proceedings of the Tenth Workshop on Statistical Machine Translation, Association for Computational Linguistics, Lisbon, Portugal, 2015, pp. 392–395. URL: https://aclanthology.org/W15-3049. doi:10.18653/v1/W15-3049.
[11] L. Wang, N. Yang, X. Huang, L. Yang, R. Majumder, F. Wei, Multilingual E5 text embeddings: A technical report, 2024. arXiv:2402.05672.

A. Zero Shot prompt template

You are a text de-toxifier. You receive a toxic text and your task is to rewrite the toxic parts in a non-toxic way while saving the main content. Do not add more text, if the text is short, keep it short. Keep the non-toxic words in the text the same. You do not respond anything else, just the rewritten, non-toxic text. The input is in "" and can be in any language. The result must be in the same language as the input.
"{phrase}"

B. Few Shot with languages and examples prompt template

We have not included the entire prompt here due to its length; at the place where it says "or in other languages:", it continues with similar examples for the other languages.

You are a text detoxifier. On input you receive a text which may be toxic or harmful. Your task is to rewrite this text in a non harmful or toxic way while preserving the original content and context. Output only contains the detoxified text, nothing else like notes or additional information. You do not add any more context to the resulting text, which is not in the original text.
Try not to rewrite the original text too much, just either remove the toxic part completely, or replace it with some non-toxic words while preserving the meaning and context. The language of the input is {language} and the language of the output must be the same.
Here are some examples in English what should be the output for given texts:
- Input: "she is looking like shite .", Output: "she is looking like bad",
- Input: "btw jonhson county is a real shithole in case you were wondering .", Output: "Johnson county is not good.",
- Input: "why are canadians so taken up with hockey when their teams are all losers ?", Output: "Why are canadians so taken up woth hockey?",
- Input: "keep crying me a river , bitch boy .", Output: "keep crying me a river , ya boy .",
- Input: "until feminists take on islam and how it treats women , the entire movement is a total farce !", Output: "Feminism is not a good movement",
- Input: "countless crap has been spotted by me .", Output: "countless wrong things has been spotted by me .",
or in other languages:
- Input (in German): "dieser Holtmann scheint ein Grüner-Bastard zu sein! #BerichtAusBerlin", Output: "Dieser Holtmann scheint ein Grüner zu sein! #BerichtAusBerlin"
- Input (in Spanish): "pues hazme los deberes furcia barata", Output: "Pues hazme los deberes a tiempo barata."
Provide only the detoxified text as the output, no additional information.
Input (in {language}): "{phrase}"

C. RAG prompt

## Task
You are a text detoxifier. On input you receive a text which may be toxic or harmful. Your task is to rewrite this text in a way that does not contain any toxicity or harmful words, while preserving the original content and context. The Output contains only the detoxified text and nothing else like notes or additional information. You do not add any more context to the resulting text, which is not in the original text. Do not rewrite the original text too much, just either remove the toxic part completely, or replace it with some non-toxic words while preserving the meaning and context. The language of the input is {language} and the language of the response must be the same.

## Toxic words
The input text may contain offensive or harmful words. You should either remove them or replace them with non-offensive words. Here are some examples of toxic words you may find in the input text:
{toxic_words}
These CAN NOT be used in the output text. You must replace them with non-toxic words. If that is not possible, remove them completely.

## Examples
{examples}

## Input
Your output is only the detoxified text, you do not say anything else.
Input: {phrase}
Output:
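For completeness, the sketch below shows one way the Appendix C template could be assembled at inference time, reusing the helper functions sketched in Section 4.4.2. Storing the template in a separate text file and the helper and variable names are our own assumptions, not part of the authors' released code.

def build_rag_prompt(language: str, phrase: str, toxic_words_block: str, examples_block: str) -> str:
    """Fill the Appendix C template with the dynamically retrieved parts.

    The template is assumed to be stored verbatim (with {language}, {toxic_words},
    {examples} and {phrase} placeholders) in a local text file.
    """
    with open("rag_prompt_template.txt", encoding="utf-8") as f:
        template = f.read()
    return template.format(
        language=language,
        toxic_words=toxic_words_block,
        examples=examples_block,
        phrase=phrase,
    )

# Example usage, combining the earlier sketches (names are illustrative):
# prompt = build_rag_prompt(
#     language="German",
#     phrase=toxic_text,
#     toxic_words_block=format_toxic_words(select_toxic_words(toxic_text, lexicon)),
#     examples_block=retrieve_examples(toxic_text, language="de"),
# )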