<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Forum for Information Retrieval Evaluation</journal-title>
      </journal-title-group>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>of Transformer Models</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Md Saroar Jahan</string-name>
          <email>saroar.jahan@huawei.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Fadi Hassan</string-name>
          <email>fadi.hassan@huawei.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Walid Aransa</string-name>
          <email>walid.aransa@huawei.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Abdessalam Bouchekif</string-name>
          <email>abdessalam.bouchekif@huawei.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <kwd-group>
          <kwd>Hateful Span Detection</kwd>
          <kwd>Conversational Hate Detection</kwd>
          <kwd>SpanBERT</kwd>
        </kwd-group>
        <aff id="aff0">
          <label>0</label>
          <institution>Huawei Finland Research Center</institution>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2023</year>
      </pub-date>
      <volume>1</volume>
      <fpage>5</fpage>
      <lpage>18</lpage>
      <abstract>
        <p>The classification of hate speech and offensive language presents significant challenges, primarily due to the scarcity of low-resource datasets and the absence of pre-trained models. This paper offers a comprehensive overview of offensive language identification results in the context of HASOC-2023 across various languages and tasks, including Sinhala, Gujarati, Bengali, Assamese, and Bodo, as well as hateful span detection. To address these challenges, we harnessed the power of BERT-based models, leveraging resources such as XLM-RoBERTa-large, L3-cube, BanglaHateBERT, and BanglaBERT. Our research findings yielded promising results, notably showcasing the superior performance of XLM-RoBERTa-large over monolingual models in the majority of cases. For Task 3, SpanBERT performed outstandingly.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-2">
      <title>1. Introduction</title>
      <p>
        Social media is a widely popular and convenient platform for open expression and online
communication with others. Unfortunately, it also provides the means for distributing abusive
and aggressive content such as sexism, racism, politics, cyberbullying, and blackmailing.
Nockleby [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] stated that “hate speech disparages a person or group based on some characteristics
such as race, color, and ethnicity”. Addressing offensive language on social media is now a major
challenge. Various shared tasks and data-sharing initiatives within the research community
aim to motivate researchers to develop innovative solutions for detecting abusive content.
Among the initiatives, HASOC has gained significant popularity, with its previous editions:
HASOC-2019 [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], HASOC-2020[
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], HASOC-2021[
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] and HASOC-2022[
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. These editions focus
on hate speech and offensive language identification in English, German, and Hindi. SemEval is
another noteworthy initiative. SemEval-2019 [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] focuses on the detection of hate speech against
immigrants and women in Spanish and English messages extracted from Twitter.
In SemEval-2023 [8], the focus is on detecting and identifying comments and tweets containing
sexist expressions. Additionally, other shared tasks have been proposed, such as GermEval [9] for the
German language, EVALITA [10] for Italian, and OSACT [11] for Arabic content, all
of which contribute to this important area of research.
      </p>
      <p>The best-performing models for these tasks are Transformer-based, including RoBERTa [12], DeBERTa [13], ALBERT
[14] and XLM-RoBERTa [15]. In SemEval-2023 Task 10-A, the top-performing models [16] and
[17] are based on DeBERTa. The best-performing model in SemEval-2020 Task 12-A [18]
used an ensemble of ALBERT models of different sizes, while the second-ranked team [19]
used RoBERTa-base and XLM-RoBERTa. Monolingual Transformers often give better results than
multilingual ones when addressing low-resource languages. The winning
teams in SemEval-2020 Task 12-A for the Arabic [20] and Danish [21] languages achieved the highest
performance using AraBERT [22] and Nordic BERT (https://github.com/certainlyio/nordic_bert), respectively.</p>
      <p>Hate speech detection becomes more challenging when social posts are written in a Code-Mixed
(CM) language. Code-mixing, the practice of blending words from two languages within a
single sentence, is becoming increasingly common in various bilingual communities, which
renders the automatic detection task more challenging [23, 24]. In HASOC-2022, three
tasks were hosted: Task 1 and Task 2 involved binary and multi-class classification for both
German and code-mixed languages, while Task 3 focused on identifying offensive language in
Marathi. The highest performance in Task 1 [25] was achieved using Google MuRIL
(https://huggingface.co/google/muril-base-cased), a BERT model pre-trained on 17 Indian languages. HASOC-2023
introduced Task 1 and Task 4, focusing on the detection of hate speech, offensive content,
and profanity, while Task 3 centered on detecting hate speech spans within social media posts.
We took part in Task 1 for Bengali, Gujarati, Sinhala, Assamese, and Bodo, as well as
Task 3, which involved English hate speech span detection. To accomplish these
tasks, we used the HASOC-2023 shared dataset for both training and validation
without any external data.</p>
      <p>Our strategy predominantly relied on cutting-edge Transformer models to tackle these
challenges. The remainder of this paper is structured as follows: Section 2 presents a detailed description
of the tasks and datasets, Section 3 provides an in-depth look at our methodology and model
architecture, and the conclusion summarizes our findings and delineates potential
directions for future research.</p>
    </sec>
    <sec id="sec-3">
      <title>2. Task Description</title>
      <p>This section presents the task descriptions for HASOC 2023 [26] as follows:</p>
      <p>Tasks 1 and 4 focus on identifying hate speech, offensive language, and profanity in different
languages using natural language processing techniques [27, 28, 29, 30, 31, 32, 33]. These tasks
mainly involve classifying tweets into two categories: Hate and Offensive (HOF) or Non Hate-Offensive
(NOT).</p>
      <p>• Task 1A: deals with identifying hate and offensive content in Sinhala, a low-resource
Indo-Aryan language spoken in Sri Lanka.</p>
      <p>Task 3 aims to detect the various hateful spans within a sentence already considered hateful
[34]. The input texts are all in English. The detection of hateful spans is cast as
a sequence labeling problem: for every token of the sequence, human annotators have
manually annotated the start and end of each hateful span using BIO
tagging, where ’B’ marks the beginning of a hate span, ’I’ marks the continuation of a
hate span, and ’O’ marks a non-hate token.</p>
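      <p>As an illustration of the BIO scheme described above, the following minimal Python sketch (our own hypothetical helper, not the shared-task code) maps token-index span annotations onto B/I/O tags:</p>
      <preformat>
```python
# Illustrative sketch: convert token-level hateful span annotations to BIO tags.
# The function name and input format are assumptions for this example.
def spans_to_bio(tokens, hateful_spans):
    """tokens: list of tokens; hateful_spans: (start, end) token index pairs,
    inclusive. Returns one tag per token."""
    tags = ["O"] * len(tokens)           # 'O' = non-hate token
    for start, end in hateful_spans:
        tags[start] = "B"                # 'B' = beginning of a hateful span
        for i in range(start + 1, end + 1):
            tags[i] = "I"                # 'I' = continuation of the span
    return tags

tokens = ["that", "comment", "was", "truly", "vile"]
print(spans_to_bio(tokens, [(3, 4)]))    # ['O', 'O', 'O', 'B', 'I']
```
      </preformat>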
    </sec>
    <sec id="sec-4">
      <title>3. Methodology</title>
      <p>This section offers a comprehensive overview of the model architecture
and the strategies employed to address each task. Due to the similarities between Task 1 and
Task 4, we have consolidated them into a single section, while Task 3 is described separately.</p>
      <sec id="sec-4-1">
        <title>3.1. Task 1 and 4 Model Architecture</title>
        <sec id="sec-4-1-1">
          <title>For Task 1 and Task 4, we adopted two main strategies:</title>
          <p>1. Utilizing Different BERT Models: We conducted experiments with both multilingual and
monolingual BERT models.
2. Augmenting Training Data: Our second approach involved enhancing the training data
through automatic annotation.</p>
          <p>We assessed several models, including multilingual ones such as XLM-RoBERTa-large and
IndicBERT, and monolingual models like L3-cube, BanglaBERT, and BanglaHateBERT.
Following our experiments, we selected XLM-RoBERTa-large as the baseline model due to
its superior performance compared to all the monolingual models. This performance
difference may be attributed to the age of some monolingual models, such as BanglaHateBERT,
which is considerably older than XLM-RoBERTa-large. The Bangla L3-cube
monolingual model did exhibit slightly better performance, by +0.06 F1 score, than
XLM-RoBERTa-large. Nevertheless, since XLM-RoBERTa-large outperformed the monolingual
models in most cases, we opted for it as the baseline model.</p>
          <p>To further enhance our model’s performance, we pursued a second strategy, which involved
expanding our training dataset. However, due to a lack of suitable datasets for most of these
languages, we hypothesized that incorporating automatically annotated test data into the
training data could improve model learning. To implement this, we initially trained the model
using 90% of the training data and 10% of the evaluation data. We then determined the optimal
thresholds during evaluation and applied upper and lower thresholds to automatically annotate
part of the test data. For example, we used a 0.90 upper threshold and a 0.20 lower threshold.
After automatically annotating that portion of test data with these thresholds, we retrained
the model by adding this part of test data to the training data and observed a 3% improvement
in model performance. This hypothesis was further tested with external public data, where
automatic annotations were applied using these upper and lower thresholds, resulting in a 1-2%
improvement. Additionally, we employed an ensemble of 5 models, which contributed a
further 0.4% increase in F1 score (see Table 3).</p>
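          <p>The threshold-based automatic annotation described above can be sketched as follows; the helper names and toy scores are our own illustrative assumptions, while 0.90 and 0.20 are the example thresholds from the text:</p>
          <preformat>
```python
# Sketch of threshold-based pseudo-labeling: predictions at or above the upper
# threshold become HOF, those at or below the lower threshold become NOT, and
# uncertain scores in between are discarded.
UPPER, LOWER = 0.90, 0.20  # example thresholds from the text

def pseudo_label(texts, hof_scores):
    """texts: unlabeled samples; hof_scores: model P(HOF) per sample."""
    selected = []
    for text, p in zip(texts, hof_scores):
        if p >= UPPER:
            selected.append((text, "HOF"))   # confident positive
        elif p > LOWER:
            continue                          # too uncertain, leave unlabeled
        else:
            selected.append((text, "NOT"))   # confident negative
    return selected

print(pseudo_label(["t1", "t2", "t3"], [0.95, 0.50, 0.10]))
# [('t1', 'HOF'), ('t3', 'NOT')]
```
          </preformat>
          <p>The selected pairs are then merged into the training set and the model is retrained, as described above.</p>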
        </sec>
      </sec>
      <sec id="sec-4-2">
        <title>3.2. Task 3 Model Architecture</title>
        <p>In Task 3, the goal is to find all the hateful spans. A hateful span is a group of words that together
express the hatred in the sentence. The data provided for this task comprises 1936 training samples,
485 validation samples, and 606 test samples. The labels used are
”B-HateSpan”, denoting the first token in a hateful span, and ”I-HateSpan”, indicating tokens
inside a hateful span. Additionally, this section includes an analysis of the model’s performance
and an in-depth explanation of the outcomes.</p>
        <p>The architecture of our best submitted model for Task 3 employs a teacher-student framework
and utilizes the SpanBERT-base-cased model along with Conditional Random Fields (CRF) for
sequence tagging. The approach can be summarized as follows:
• Teacher Model: An ensemble of SpanBERT-base-cased models, each combined with CRF.
• Student Model: A single SpanBERT-base-cased model, also integrated with CRF. The
student model is distilled from the teacher model using the following loss:
ℒ = (1 − α) ⋅ CE(y_student, y_gold) + α ⋅ MSE(h_student, h_teacher)   (1)</p>
        <p>
1. Model Comparison: Table 5 provides a comparison of different base models with
varying configurations, including casing, k-fold cross-validation, and tagging schemes
(BIO and IO). SpanBERT-large with lower casing and 5-fold
cross-validation achieved the highest private score of 62.322, indicating its effectiveness
in identifying hateful spans.
2. Impact of Casing: The casing of the model input, whether lower case or true case,
affects the model’s performance. Lower casing generally performs better, as indicated
by the higher private and public scores in several configurations.
3. Tagging Scheme: The choice of tagging scheme (BIO vs. IO) also influences performance.
Models using the BIO tagging scheme tend to yield better results, as seen in their higher private
and public scores.
4. Ensemble vs. Single Model: The ensemble of multiple
SpanBERT-base-cased teacher models provides valuable knowledge transfer to the student
model, resulting in improved performance.
5. Distillation Effect: Using distillation with an α value of 0.95 to transfer
knowledge from the teachers to the student model enhances performance compared to
a standalone student model (see Eq. 1).</p>
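        <p>As a toy numeric sketch of the distillation objective in Eq. (1), the snippet below combines a supervised cross-entropy term with a teacher-matching MSE term; the function and variable names are our own assumptions, and α = 0.95 follows the value reported above:</p>
        <preformat>
```python
import math

# Illustrative sketch, not the authors' implementation: distillation loss mixing
# cross-entropy against gold labels with MSE against the teacher's representation.
ALPHA = 0.95  # weight on the teacher-matching term, as reported in the paper

def cross_entropy(student_probs, gold_idx):
    # negative log-likelihood of the gold class
    return -math.log(student_probs[gold_idx])

def mse(a, b):
    # mean squared error between two equal-length vectors
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def distill_loss(student_probs, gold_idx, student_hidden, teacher_hidden):
    ce = cross_entropy(student_probs, gold_idx)   # supervised CE term
    fit = mse(student_hidden, teacher_hidden)     # teacher-matching MSE term
    return (1 - ALPHA) * ce + ALPHA * fit

loss = distill_loss([0.7, 0.2, 0.1], 0, [0.1, 0.4], [0.2, 0.5])
print(round(loss, 4))  # 0.0273
```
        </preformat>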
        <p>Overall, the model architecture involving an ensemble of SpanBERT models with CRF,
especially when using lower casing and BIO tagging, demonstrates strong performance in
identifying hateful spans in text. The distillation process further boosts the student model’s
effectiveness.</p>
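        <p>One simple way to combine the tag predictions of several sequence taggers, shown here purely as an illustration (the paper combines its teachers via distillation rather than this exact scheme), is a per-token majority vote over the predicted BIO tags:</p>
        <preformat>
```python
# Illustrative sketch: merge per-model BIO tag sequences by per-token majority vote.
from collections import Counter

def majority_vote(tag_sequences):
    """tag_sequences: list of per-model tag lists, all of equal length."""
    merged = []
    for token_tags in zip(*tag_sequences):
        # pick the most frequent tag for this token position
        merged.append(Counter(token_tags).most_common(1)[0][0])
    return merged

preds = [
    ["O", "B", "I", "O"],   # model 1
    ["O", "B", "O", "O"],   # model 2
    ["B", "B", "I", "O"],   # model 3
]
print(majority_vote(preds))  # ['O', 'B', 'I', 'O']
```
        </preformat>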
      </sec>
    </sec>
    <sec id="sec-5">
      <title>4. Conclusion</title>
      <p>In this paper, we have presented a comprehensive analysis of hate speech and offensive language
identification across multiple languages and tasks in the HASOC-2023 competition. In Task 1 and
Task 4, our work involved identifying offensive language in the Sinhala, Gujarati, Bengali,
Assamese, and Bodo languages, while Task 3 involved hateful span detection in English
text.</p>
      <p>Our research not only showcased the effectiveness of transformer-based models in these
shared tasks but also emphasized the importance of model selection, task-specific customization,
and innovative strategies to address the challenges posed by low-resource languages and multilingual
and cross-lingual contexts.</p>
      <p>As future work, further investigations are needed to explore the use of more diverse and
specialized transformer models, as well as fine-tuning of model parameters, to achieve even better
results. Additionally, the application of ensemble techniques and the
incorporation of multiple thresholds for automatic annotation represent promising avenues
for improving model robustness and generalization.</p>
      <p>
[8] H. Kirk, W. Yin, B. Vidgen, P. Röttger, Semeval-2023 task 10: Explainable detection of online
sexism, in: Proceedings of the 17th International Workshop on Semantic Evaluation,
SemEval@ACL 2023, Toronto, Canada, 13-14 July 2023, Association for Computational
Linguistics, 2023, pp. 2193–2210.
[9] M. Wiegand, M. Siegel, J. Ruppenhofer, Overview of the germeval 2018 shared task on the
identification of offensive language (2018).
[10] C. Bosco, D. Felice, F. Poletto, M. Sanguinetti, T. Maurizio, Overview of the evalita 2018
hate speech detection task, in: EVALITA 2018-Sixth Evaluation Campaign of Natural
Language Processing and Speech Tools for Italian, volume 2263, CEUR, 2018, pp. 1–9.
[11] H. Mubarak, H. Al-Khalifa, A. Al-Thubaity, Overview of OSACT5 shared task on Arabic
offensive language and hate speech detection, in: Proceedings of the 5th Workshop on
Open-Source Arabic Corpora and Processing Tools with Shared Tasks on Qur’an QA and
Fine-Grained Hate Speech Detection, European Language Resources Association, 2022, pp.
162–166.
[12] Y. Liu, M. Ott, N. Goyal, J. Du, M. Joshi, D. Chen, O. Levy, M. Lewis, L. Zettlemoyer,
V. Stoyanov, Roberta: A robustly optimized BERT pretraining approach, CoRR
abs/1907.11692 (2019).
[13] P. He, J. Gao, W. Chen, Debertav3: Improving deberta using electra-style pre-training with
gradient-disentangled embedding sharing, CoRR abs/2111.09543 (2021).
[14] Z. Lan, M. Chen, S. Goodman, K. Gimpel, P. Sharma, R. Soricut, ALBERT: A lite BERT for
self-supervised learning of language representations, CoRR abs/1909.11942 (2019).
[15] A. Conneau, K. Khandelwal, N. Goyal, V. Chaudhary, G. Wenzek, F. Guzmán, E. Grave,
M. Ott, L. Zettlemoyer, V. Stoyanov, Unsupervised cross-lingual representation learning at
scale, CoRR abs/1911.02116 (2019).
[16] M. Zhou, PingAnLifeInsurance at SemEval-2023 task 10: Using multi-task learning to better
detect online sexism, in: Proceedings of the 17th International Workshop on Semantic
Evaluation (SemEval-2023), 2023.
[17] F. Hassan, A. Bouchekif, W. Aransa, Firc at semeval-2023 task 10: Fine-grained classification
of online sexism content using deberta, in: Proceedings of the 17th International
Workshop on Semantic Evaluation, SemEval@ACL 2023, Toronto, Canada, 13-14 July 2023,
Association for Computational Linguistics, 2023, pp. 1824–1832.
[18] G. Wiedemann, S. M. Yimam, C. Biemann, UHH-LT at SemEval-2020 task 12: Fine-tuning
of pre-trained transformer networks for offensive language detection, in: Proceedings of
the Fourteenth Workshop on Semantic Evaluation, 2020.
[19] S. Wang, J. Liu, X. Ouyang, Y. Sun, Galileo at SemEval-2020 task 12: Multi-lingual learning
for offensive language identification using pre-trained language models, in: Proceedings
of the Fourteenth Workshop on Semantic Evaluation, 2020.
[20] H. Alami, S. Ouatik El Alaoui, A. Benlahbib, N. En-nahnahi, LISAC FSDM-USMBA team at
SemEval-2020 task 12: Overcoming AraBERT’s pretrain-finetune discrepancy for Arabic
offensive language identification, in: Proceedings of the Fourteenth Workshop on Semantic
Evaluation, 2020.
[21] M. Pàmies, E. Öhman, K. Kajava, J. Tiedemann, LT@Helsinki at SemEval-2020 task 12:
Multilingual or language-specific BERT?, in: Proceedings of the Fourteenth Workshop on
Semantic Evaluation, 2020.
[22] W. Antoun, F. Baly, H. Hajj, AraBERT: Transformer-based model for Arabic language
understanding, in: Proceedings of the 4th Workshop on Open-Source Arabic Corpora and
Processing Tools, with a Shared Task on Offensive Language Detection, European Language
Resource Association, Marseille, France, 2020, pp. 9–15. URL: https://aclanthology.org/
2020.osact-1.2.
[23] I. A. Bhat, V. Mujadia, A. Tammewar, R. A. Bhat, M. Shrivastava, Iiit-h system submission for
fire 2014 shared task on transliterated search, in: Proceedings of the Forum for Information
Retrieval Evaluation, 2014, pp. 48–53.
[24] M. L. Ripoll, F. Hassan, J. Attieh, G. Collell, A. Bouchekif, Multi-lingual contextual hate
speech detection using transformer-based ensembles, in: Forum for Information Retrieval
Evaluation (Working Notes) (FIRE), CEUR-WS.org, 2022.
[25] N. K. Singh, U. Garain, An analysis of transformer-based models for code-mixed
conversational hate-speech identification, in: Forum for Information Retrieval Evaluation
(Working Notes) (FIRE), CEUR-WS.org, 2022.
[26] S. Masud, M. Bedi, M. A. Khan, M. S. Akhtar, T. Chakraborty, Proactively reducing the
hate intensity of online posts via hate speech normalization, in: Proceedings of the
28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD ’22,
Association for Computing Machinery, New York, NY, USA, 2022, p. 3524–3534. URL:
https://doi.org/10.1145/3534678.3539161. doi:10.1145/3534678.3539161.
[27] S. Satapara, H. Madhu, T. Ranasinghe, A. E. Dmonte, M. Zampieri, P. Pandya, N. Shah,
M. Sandip, P. Majumder, T. Mandl, Overview of the hasoc subtrack at fire 2023:
Hate speech identification in sinhala and gujarati, in: K. Ghosh, T. Mandl, P. Majumder, M. Mitra
(Eds.), Working Notes of FIRE 2023 - Forum for Information Retrieval Evaluation, Goa,
India. December 15-18, 2023, CEUR Workshop Proceedings, CEUR-WS.org, 2023.
[28] K. Ghosh, A. Senapati, A. S. Pal, Annihilate Hates (Task 4, HASOC 2023): Hate Speech
Detection in Assamese, Bengali, and Bodo languages, in: Working Notes of FIRE 2023
Forum for Information Retrieval Evaluation, CEUR, 2023.
[29] S. Satapara, S. Masud, H. Madhu, M. A. Khan, M. S. Akhtar, T. Chakraborty, S. Modha,
T. Mandl, Overview of the HASOC subtracks at FIRE 2023: Detection of hate spans and
conversational hate-speech, in: Proceedings of the 15th Annual Meeting of the Forum
for Information Retrieval Evaluation, FIRE 2023, Goa, India. December 15-18, 2023, ACM,
2023.
[30] T. Ranasinghe, K. Ghosh, A. S. Pal, A. Senapati, A. E. Dmonte, M. Zampieri, S. Modha,
S. Satapara, Overview of the HASOC subtracks at FIRE 2023: Hate speech and offensive
content identification in assamese, bengali, bodo, gujarati and sinhala, in: Proceedings of
the 15th Annual Meeting of the Forum for Information Retrieval Evaluation, FIRE 2023,
Goa, India. December 15-18, 2023, ACM, 2023.
[31] K. Ghosh, A. Senapati, U. Garain, Baseline bert models for conversational hate speech
detection in code-mixed tweets utilizing data augmentation and offensive language
identification in marathi, in: Fire, 2022. URL: https://api.semanticscholar.org/CorpusID:
259123570.
[32] K. Ghosh, D. A. Senapati, Hate speech detection: a comparison of mono and multilingual
transformer model with cross-language evaluation, in: Proceedings of the 36th Pacific Asia
Conference on Language, Information and Computation, De La Salle University, Manila,
Philippines, 2022, pp. 853–865. URL: https://aclanthology.org/2022.paclic-1.94.
[33] K. Ghosh, D. Sonowal, A. Basumatary, B. Gogoi, A. Senapati, Transformer-based hate
speech detection in assamese, in: 2023 IEEE Guwahati Subsection Conference (GCON),
2023, pp. 1–5. doi:10.1109/GCON58516.2023.10183497.
[34] S. Masud, M. A. Khan, M. S. Akhtar, T. Chakraborty, Overview of the HASOC Subtrack
at FIRE 2023: Identification of Tokens Contributing to Explicit Hate in English by Span
Detection, in: Working Notes of FIRE 2023 - Forum for Information Retrieval Evaluation,
CEUR, 2023.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>J. T.</given-names>
            <surname>Nockleby</surname>
          </string-name>
          , Hate speech,
          <source>Encyclopedia of the American constitution 3</source>
          (
          <year>2000</year>
          )
          <fpage>1277</fpage>
          -
          <lpage>1279</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>T.</given-names>
            <surname>Mandl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Modha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Majumder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Patel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Dave</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Mandlia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Patel</surname>
          </string-name>
          ,
          <article-title>Overview of the hasoc track at fire 2019: Hate speech and offensive content identification in indo-european languages</article-title>
          ,
          <source>in: Proceedings of the 11th forum for information retrieval evaluation</source>
          ,
          <year>2019</year>
          , pp.
          <fpage>14</fpage>
          -
          <lpage>17</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>T.</given-names>
            <surname>Mandl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Modha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kumar M</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. R.</given-names>
            <surname>Chakravarthi</surname>
          </string-name>
          ,
          <article-title>Overview of the hasoc track at fire 2020: Hate speech and offensive language identification in tamil, malayalam, hindi, english and german</article-title>
          ,
          <source>in: Forum for Information Retrieval Evaluation</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>29</fpage>
          -
          <lpage>32</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>S.</given-names>
            <surname>Modha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mandl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. K.</given-names>
            <surname>Shahi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Madhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Satapara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Ranasinghe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zampieri</surname>
          </string-name>
          ,
          <article-title>Overview of the HASOC subtrack at FIRE 2021: Hate speech and offensive content identification in english and indo-aryan languages and conversational hate speech</article-title>
          ,
          <source>in: FIRE 2021: Forum for Information Retrieval Evaluation, Virtual Event, India, December 13 - 17, 2021, ACM</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>3</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>S.</given-names>
            <surname>Modha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mandl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Majumder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Satapara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Patel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Madhu</surname>
          </string-name>
          ,
          <article-title>Overview of the HASOC subtrack at FIRE 2022: Identification of conversational hate-speech in hindi-english code-mixed and german language</article-title>
          , in:
          <source>Working Notes of FIRE 2022 - Forum for Information Retrieval Evaluation, Kolkata, India, December 9-13, 2022</source>
          , volume
          <volume>3395</volume>
          <source>of CEUR Workshop Proceedings, CEUR-WS.org</source>
          ,
          <year>2022</year>
          , pp.
          <fpage>475</fpage>
          -
          <lpage>488</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>M.</given-names>
            <surname>Zampieri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Malmasi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Nakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Rosenthal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Farra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Kumar</surname>
          </string-name>
          ,
          <article-title>Semeval-2019 task 6: Identifying and categorizing offensive language in social media (OffensEval)</article-title>
          , arXiv preprint arXiv:1903.08983 (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>M.</given-names>
            <surname>Zampieri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Nakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Rosenthal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Atanasova</surname>
          </string-name>
          , G. Karadzhov,
          <string-name>
            <given-names>H.</given-names>
            <surname>Mubarak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Derczynski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Pitenis</surname>
          </string-name>
          , Ç. Çöltekin,
          <article-title>Semeval-2020 task 12: Multilingual offensive language identification in social media (OffensEval 2020)</article-title>
          , arXiv preprint arXiv:2006.07235 (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>