<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Team Detox at TextDetox CLEF 2025: Multilingual Text Detoxification using LLM</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Gopala Krishna Nuthakki</string-name>
          <email>sivagopalkrishna04@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Lekkala Sai Teja</string-name>
          <email>lekkalad_ug_22@cse.nits.ac.in</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Atul Mishra</string-name>
          <email>atul.mishra@bmu.edu.in</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Computer Science &amp; Engineering, BML Munjal University</institution>
          ,
          <addr-line>Haryana</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Computer Science &amp; Engineering, National Institute of Technology</institution>
          ,
          <addr-line>Silchar</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <abstract>
<p>Toxic online language poses a severe threat to the safety and inclusivity of users on many online platforms. This work is part of the Multilingual Text Detoxification (TextDetox) shared task at PAN, CLEF 2025, which aims to convert toxic texts into non-toxic ones while preserving semantic meaning, in languages ranging from high-resource to underrepresented. The dataset used in this task, provided by the CLEF PAN-25 initiative, consists of toxic sentences in 15 languages from around the globe: English, Spanish, German, Chinese, Arabic, Hindi, Ukrainian, Russian, Amharic, Italian, French, Hebrew, Hinglish, Japanese, and Tatar. Our method, applicable to all 15 languages, begins by identifying and masking toxic words in input sentences. The original and masked versions are then provided to large language models (LLMs) to generate detoxified outputs that retain the intended meaning while eliminating offensive language. These experiments are evaluated on style transfer accuracy, semantic preservation, and fluency. The study shows competitive results across multiple languages, highlighting the effectiveness of a hybrid approach in multilingual style transfer tasks. Our results further contribute toward inclusive and robust moderation tools that enable safer communication in multilingual digital spaces. All our code is available on GitHub.</p>
      </abstract>
      <kwd-group>
<kwd>PAN 2025</kwd>
        <kwd>Large Language Models</kwd>
        <kwd>Text Detoxification</kwd>
        <kwd>Multilingual</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
Online interactions are becoming increasingly multilingual, and while this global reach is of immense value, it also
poses the challenge of moderating offensive and toxic content [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] across diverse linguistic and cultural contexts.
Keeping online discussions respectful and safe requires methods that can effectively detoxify
language without diluting the intended meaning of a message.
      </p>
      <p>
        The Multilingual Text Detoxification (TextDetox) 2025 [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] shared task entails rewriting toxic
user-generated content into non-toxic content without sacrificing the semantic meaning of the source content,
across languages. This task is hosted as part of a broader effort to combat online toxicity, moving beyond
conventional content moderation strategies that rely on blocking or removing harmful texts. Instead,
the goal is to proactively rewrite toxic content, preserving its core message while eliminating offensive
or obscene language. The task focuses on explicit toxicity, which includes direct use of obscene or rude
lexicons, where neutral content can still be extracted and preserved.
      </p>
      <p>One of the main challenges in this area is the variability of toxic phrasing across languages and
dialects. The vast majority of languages lack labeled data for this task, which makes supervised learning
unfeasible. In addition, detoxified content often loses its meaning in translation or is misinterpreted as
culturally inappropriate.</p>
      <p>To address these challenges, the proposed study used a hybrid approach that combines rule-based
masking with model-guided generation. First, toxic spans were identified using keyword-based lexicons.
These words were subsequently masked to reduce generation bias. By feeding both the original and
masked sentences into multilingual large language models, we ensured that the detoxified outputs were
semantically aligned with the input and effectively removed offensive content. This two-input method
allowed the model to better identify contextual nuances, especially in code-mixed and low-resource
languages, and improved the outputs.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Dataset Description</title>
      <p>
        The dataset, acquired via CLEF 2025 PAN [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], contains 400 parallel sentence pairs, each with a toxic
and a detoxified version, for 9 languages: English, Spanish, German, Chinese, Arabic, Hindi, Ukrainian,
Russian, and Amharic. In addition, the dataset includes a Toxic Keywords List for all 15 languages
of the shared task. This resource captures commonly occurring offensive or toxic
terms and phrases, serving both as lexical guidance for detoxification models and as a benchmark for
evaluating toxic span identification. Furthermore, toxic span annotations are available for the same 9
languages with parallel data, enabling more fine-grained analysis.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Methodology</title>
      <p>This section outlines the design and development of our multilingual
text detoxification system. We explain the architecture, the low-level techniques for identifying and
cleaning up toxic text to create detoxified versions, and how the implementation strategies hold
across languages. The objective of the study is to develop a generalizable system that supports several
languages and cleans up toxic input without sacrificing the original meaning or context.</p>
      <sec id="sec-3-1">
        <title>3.1. Overall System Architecture</title>
        <p>The system implements a multilingual detoxification pipeline in which toxic input text
is converted into its non-toxic equivalent for different languages. The system starts with a
toxic sentence submitted by the user in one of the supported languages. This input is fed to a modular
detoxification engine that runs several detoxification models concurrently. Each model processes the
input independently and produces its detoxified equivalent. These outputs are then gathered and assessed
to compare fluency, factual consistency, and reduction of toxicity. The architecture is designed so that
it can be easily extended with more LLMs.</p>
        <p>The framework has multiple functional layers. The Input Layer takes raw toxic sentences without
preprocessing. In the Masking Layer, toxic words in the sentence are detected automatically and masked
to maintain sentence structure while targeting the objectionable content. The Model Layer then feeds both the
original and masked sentences to multilingual LLMs, which generate detoxified versions of the
text that aim to remove toxicity while maintaining the original semantic intent.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Detoxification Models and Techniques</title>
        <p>The test set comprises 9,000 toxic instances. For every toxic sentence, the toxic
words are first identified and replaced with the token [MASK]. This gives a masked version of the
original sentence, keeping the structure intact without the offensive content.</p>
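<p>A minimal sketch of this masking step, assuming a small per-language lexicon and whitespace tokenization (Chinese text would be segmented with jieba first):</p>
<preformat>
```python
def mask_toxic(sentence: str, lexicon: set) -> str:
    """Replace lexicon words with [MASK], keeping the sentence structure."""
    tokens = sentence.split()  # whitespace tokenization; Chinese would use jieba
    masked = ["[MASK]" if t.lower().strip(".,!?") in lexicon else t for t in tokens]
    return " ".join(masked)

lexicon = {"idiot", "stupid"}  # toy lexicon for illustration
print(mask_toxic("You are a stupid idiot", lexicon))
# -> You are a [MASK] [MASK]
```
</preformat>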
        <p>
          Both the original toxic sentence and its censored counterpart are fed into large language models
(LLMs) with few-shot prompting. The study used multilingual instruction-following LLMs such as
GPT-4o-mini [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ], which are prompted to generate detoxified outputs that retain the semantic integrity of the
original sentence while eliminating toxicity. The models are prompted to produce grammatically
coherent, contextually correct, and non-toxic completions.
        </p>
        <p>This hybrid approach produces effective detoxification across both high-resource and low-resource
languages, without depending on supervised fine-tuning or parallel corpora. Instead, it uses the
generative capabilities of LLMs through few-shot prompting and contextual understanding. Refer to
Fig. 1 and Fig. 2 for an overview of the masking and generation pipeline.</p>
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Implementation</title>
        <p>
          Our implementation has two major steps:
• Toxic Word Deletion (lexicon-based): This step defines a class called Detoxification, which uses a
multilingual word list to filter out toxic words. The word list can either be loaded from a local file
or taken from the multilingual-toxic-lexicon dataset [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]. For non-Chinese text, the sentence is
split into words using spaces. For Chinese text, the sentence is first segmented into words using jieba.
In both cases, the toxic words are replaced with the general token [MASK], and
the cleaned sentences are saved for the next step.
• Few-shot LLM-based Style Transfer Stage (Masked Completion): The original toxic
sentences, along with their corresponding masked versions, are provided to instruction-following
large language models (LLMs) such as GPT-4o-mini, using few-shot prompting. The system
prompt includes clear formatting instructions and example cases to guide the model in replacing
the [MASK] token with appropriate detoxified content. The prompt consists of:
– the toxic sentence
– the masked sentence
– few-shot examples guiding the model to generate the detoxified output in the original language
The model takes these inputs and generates sentences that are grammatically correct,
contextually relevant, and not offensive.
        </p>
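<p>The second stage can be illustrated with a prompt-assembly sketch. The exact wording of our system prompt and the GPT-4o-mini API call are not shown in this paper, so the format below is an assumption for illustration:</p>
<preformat>
```python
def build_prompt(toxic: str, masked: str, examples: list) -> str:
    """Assemble a few-shot prompt pairing toxic and masked inputs (format is illustrative)."""
    parts = ["Rewrite the toxic sentence as a non-toxic one in the same language, "
             "replacing each [MASK] with neutral wording and preserving the meaning.\n"]
    # Few-shot examples: (toxic, masked, detoxified) triples.
    for ex_toxic, ex_masked, ex_detox in examples:
        parts.append(f"Toxic: {ex_toxic}\nMasked: {ex_masked}\nDetoxified: {ex_detox}\n")
    # The query pair: the model continues after "Detoxified:".
    parts.append(f"Toxic: {toxic}\nMasked: {masked}\nDetoxified:")
    return "\n".join(parts)

examples = [("you are an idiot", "you are an [MASK]", "you are mistaken")]
prompt = build_prompt("this plan is stupid", "this plan is [MASK]", examples)
print(prompt)
```
</preformat>
<p>The resulting string would be sent as the user message to the LLM, with the detoxified sentence read from the completion.</p>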
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Evaluation Metrics</title>
      <sec id="sec-4-1">
        <title>4.1. Automatic Evaluation</title>
        <sec id="sec-4-1-1">
          <title>4.1.1. Style Transfer Accuracy</title>
          <p>
            To evaluate the quality of detoxified outputs across languages, the TextDetox 2025 shared task used three
major evaluation metrics: Style Transfer Accuracy, Content Preservation, and Fluency.
Style Transfer Accuracy assesses whether the output successfully transfers a toxic sentence into a
non-toxic one. For this, we employed a binary toxicity classifier based on the XLM-RoBERTa-Large
model [
            <xref ref-type="bibr" rid="ref6">6</xref>
            ] that was specifically fine-tuned for toxicity detection. This metric measures how well the
detoxification model modifies the style of the input while removing toxic language.
          </p>
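<p>Once a binary toxicity classifier has labeled each output, Style Transfer Accuracy reduces to a simple aggregate. A minimal sketch follows; the classifier itself (an XLM-R fine-tune) is not shown, and the labels are supplied directly:</p>
<preformat>
```python
def style_transfer_accuracy(labels):
    """Fraction of detoxified outputs judged non-toxic by a binary classifier.

    `labels` would come from the toxicity classifier; here they are given
    directly, since the exact model head is an assumption.
    """
    return sum(1 for lab in labels if lab == "non-toxic") / len(labels)

print(style_transfer_accuracy(["non-toxic", "toxic", "non-toxic", "non-toxic"]))  # 0.75
```
</preformat>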
        </sec>
        <sec id="sec-4-1-2">
          <title>4.1.2. Content Preservation</title>
          <p>
            Content Preservation measures how closely the detoxified sentence preserves the semantic meaning
of the original toxic sentence. It is calculated as the cosine similarity between LaBSE [
            <xref ref-type="bibr" rid="ref7">7</xref>
            ] embeddings of the
input and output sentences. The higher the similarity score, the more the detoxified sentence maintains
the original meaning.
          </p>
        </sec>
        <sec id="sec-4-1-3">
          <title>4.1.3. Fluency</title>
          <p>
            Fluency approximates the degree to which the produced sentences are natural, grammatically
sound, and coherent. For this purpose, the xCOMET [
            <xref ref-type="bibr" rid="ref8">8</xref>
            ] model is used, which has been shown to correlate strongly
with human fluency judgments of detoxified text. The COMET machine translation
models serve as a robust proxy for evaluating the adequacy and linguistic quality of the output.
          </p>
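<p>The content-preservation computation reduces to a cosine similarity between two embedding vectors. A minimal sketch, with toy 3-dimensional vectors standing in for LaBSE embeddings (real LaBSE vectors are 768-dimensional):</p>
<preformat>
```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

print(round(cosine([1.0, 0.0, 1.0], [1.0, 0.0, 0.0]), 3))  # 0.707
```
</preformat>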
          <p>
The three per-sentence metrics are combined into a single joint score J by multiplying them and averaging over the n test instances:
          </p>
          <p>
J = (1/n) · Σᵢ₌₁ⁿ STA(yᵢ) × SIM(xᵢ, yᵢ) × FL(xᵢ, yᵢ), (1)
          </p>
          <p>
where xᵢ is the i-th toxic input and yᵢ its detoxified output.
          </p>
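<p>A minimal sketch of the joint-score computation, assuming the standard TextDetox formulation in which the per-sentence products of STA, SIM, and FL are averaged (the shared task's exact weighting may differ):</p>
<preformat>
```python
def joint_score(sta, sim, fl):
    """Average per-sentence product of the three metrics (standard TextDetox J)."""
    assert len(sta) == len(sim) == len(fl)
    return sum(s * m * f for s, m, f in zip(sta, sim, fl)) / len(sta)

# Two sentences: one fully detoxified, one still toxic (STA = 0 zeroes it out).
print(joint_score([1.0, 0.0], [0.9, 0.8], [0.8, 0.9]))  # roughly 0.36
```
</preformat>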
        </sec>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. LLM-as-a-Judge Evaluation</title>
        <p>For this shared task, the evaluation is also carried out with LLM-as-a-Judge, an increasingly
popular evaluation framework. In our submissions, this framework uses a fine-tuned
LLaMA 3.1-8B-Instruct model trained on human-annotated
pairwise comparisons taken from the TextDetox 2024 dataset. The fine-tuned model
evaluates results by comparing them in pairs, yielding judgments that are more human-aligned than
traditional automatic metrics. For fluency, the xCOMET model is again used due to
its strong correlation with human fluency assessments.</p>
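<p>Pairwise LLM-as-a-Judge verdicts are typically aggregated into win rates per system. A minimal sketch; the judgment-tuple format below is an assumption for illustration, not the actual judge output:</p>
<preformat>
```python
def pairwise_win_rate(judgments, system):
    """Fraction of pairwise comparisons won by `system`.

    Each judgment is (system_a, system_b, winner), as an LLM judge might emit.
    """
    relevant = [j for j in judgments if system in (j[0], j[1])]
    wins = sum(1 for j in relevant if j[2] == system)
    return wins / len(relevant)

judgments = [("ours", "baseline", "ours"),
             ("ours", "baseline", "baseline"),
             ("ours", "gpt4", "ours")]
print(pairwise_win_rate(judgments, "ours"))  # 2/3
```
</preformat>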
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Results</title>
      <p>This section presents a detailed assessment of the multilingual text detoxification models'
performance across the 15 languages of the Multilingual ParaDetox dataset. We report both
quantitative outcomes based on traditional evaluation measures and qualitative examples of model
behavior across varied linguistic and stylistic scenarios. The discussion further comments on
implications for model generalizability and effectiveness on low-resource languages.</p>
      <sec id="sec-5-1">
        <title>5.1. Quantitative Results</title>
        <p>The quantitative comparison reports, for each system, the average score over the 9 supervised
languages: our hybrid gpt-4o-mini approach alongside the shared-task baselines baseline_mt0,
baseline_gpt4, baseline_o3mini, baseline_gpt4o, baseline_delete, baseline_backtranslation, and
baseline_duplicate.</p>
      </sec>
      <sec id="sec-5-2">
        <title>5.2. Qualitative Results</title>
        <p>Cross-lingual examples are presented in Table 5 to demonstrate the system's capability to rephrase
toxic inputs while preserving their original meaning and improving overall fluency. For highly toxic
inputs, some outputs became vague or overly neutral. Using masked inputs helped the model focus
better on the toxic segments.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion</title>
      <p>This paper presents a study of multilingual text detoxification, conducted as part of the PAN 2025
shared task. Our hybrid method, which integrates lexicon-based toxic word masking with few-shot
prompting of multilingual LLMs such as GPT-4o-mini, showed robust detoxification performance on
both high-resource and low-resource languages compared to baseline methods. Semantic
consistency was preserved during the removal of explicit toxic content, as confirmed by strong results
in style transfer accuracy, semantic preservation, and fluency. These outcomes were validated using
both automated evaluation methods and assessments by large language models acting as judges.</p>
      <p>The findings indicate that hybrid models using large generative models are able to generalize across
widely varying linguistic patterns and cultural contexts even without large parallel corpora. This
research provides the foundation for developing robust, scalable, and inclusive content moderation tools
that can engage multilingual digital societies. Future work will focus on enhancing domain adaptation,
managing implicit toxicity, and scaling the model to real-world applications.</p>
      <p>All models were evaluated using three carefully selected metrics to comprehensively assess the
effectiveness of detoxification: Style Transfer Accuracy (quantified with XLM-RoBERTa fine-tuned
for toxicity detection), Content Preservation (cosine similarity of LaBSE embeddings), and Fluency
(with the xCOMET model, which agrees well with human ratings of text quality), as well as through
LLM-as-a-Judge. A combined score function over these metrics enabled us to compare models fairly
across languages.</p>
    </sec>
    <sec id="sec-7">
      <title>Declaration on Generative AI</title>
      <p>The author(s) used Grammarly, ChatGPT, and Gemini for language editing and LaTeX table formatting.
The content was reviewed and finalized by the author(s), who take full responsibility for it.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>P.</given-names>
            <surname>Madhyastha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Founta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Specia</surname>
          </string-name>
          ,
          <article-title>A study towards contextual understanding of toxicity in online conversations</article-title>
          ,
          <source>Natural Language Engineering</source>
          <volume>29</volume>
          (
          <year>2023</year>
          )
          <fpage>1538</fpage>
          -
          <lpage>1560</lpage>
          . doi:10.1017/S1351324923000414.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>D.</given-names>
            <surname>Dementieva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Protasov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Babakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Rizwan</surname>
          </string-name>
          , I. Alimova,
          <string-name>
            <given-names>C.</given-names>
            <surname>Brune</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Konovalov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Muti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Liebeskind</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Litvak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Nozza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. Shah</given-names>
            <surname>Khan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Takeshita</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Vanetik</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. A.</given-names>
            <surname>Ayele</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Schneider</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Yimam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Elnagar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mukherjee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Panchenko</surname>
          </string-name>
          ,
          <article-title>Overview of the multilingual text detoxification task at PAN 2025</article-title>
          , in: G. Faggioli,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ferro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rosso</surname>
          </string-name>
          , D. Spina (Eds.),
          <source>Working Notes of CLEF 2025 - Conference and Labs of the Evaluation Forum, CEUR Workshop Proceedings, CEUR-WS.org</source>
          ,
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>J.</given-names>
            <surname>Bevendorf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Dementieva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Fröbe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Gipp</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Greiner-Petter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Karlgren</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mayerl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Nakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Panchenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Potthast</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Shelmanov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Stamatatos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Stein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wiegmann</surname>
          </string-name>
          , E. Zangerle, Overview of PAN 2025:
          <article-title>Voight-Kampf Generative AI Detection, Multilingual Text Detoxification, Multi-Author Writing Style Analysis, and Generative Plagiarism Detection</article-title>
          , in: J.
          <string-name>
            <surname>C. de Albornoz</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Gonzalo</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          <string-name>
            <surname>Plaza</surname>
          </string-name>
          ,
          <string-name>
            <surname>A. G. S. de Herrera</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Mothe</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          <string-name>
            <surname>Piroi</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <string-name>
            <surname>Rosso</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <string-name>
            <surname>Spina</surname>
          </string-name>
          , G. Faggioli, N. Ferro (Eds.),
          <source>Experimental IR Meets Multilinguality, Multimodality, and Interaction. Proceedings of the Sixteenth International Conference of the CLEF Association (CLEF</source>
          <year>2025</year>
          ), Lecture Notes in Computer Science, Springer, Berlin Heidelberg New York,
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4] OpenAI,
          <source>GPT-4o mini: Advancing cost-efficient intelligence</source>
          ,
          <year>2024</year>
          . URL: https://openai.com/index/gpt-4o-mini-advancing-cost-efficient-intelligence/. Accessed: 2025-05-31.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5] TextDetox, Multilingual toxic lexicon, https://huggingface.co/datasets/textdetox/multilingual_toxic_lexicon,
          <year>2023</year>
          . Accessed: 2025-05-31.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>A.</given-names>
            <surname>Conneau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Khandelwal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Goyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Chaudhary</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Wenzek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Guzmán</surname>
          </string-name>
          , E. Grave,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ott</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Zettlemoyer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Stoyanov</surname>
          </string-name>
          ,
          <article-title>Unsupervised cross-lingual representation learning at scale</article-title>
          ,
          <year>2020</year>
          . URL: https://arxiv.org/abs/1911.02116. arXiv:1911.02116.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>F.</given-names>
            <surname>Feng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Cer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Arivazhagan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <article-title>Language-agnostic bert sentence embedding</article-title>
          ,
          <year>2022</year>
          . URL: https://arxiv.org/abs/2007.01852. arXiv:2007.01852.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>N. M.</given-names>
            <surname>Guerreiro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Rei</surname>
          </string-name>
          ,
          <string-name>
            <surname>D. van Stigt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Coheur</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Colombo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. F. T.</given-names>
            <surname>Martins</surname>
          </string-name>
          , xCOMET:
          <article-title>Transparent machine translation evaluation through fine-grained error detection</article-title>
          ,
          <year>2023</year>
          . URL: https://arxiv.org/abs/2310.10482. arXiv:2310.10482.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>