                                Detecting Hate Speech and Offensive Content in
                                English and Indo-Aryan Texts
                                Mohammadmostafa Rostamkhani1 , Sauleh Eetemadi2
1
  Iran University of Science and Technology (IUST), University St., Hengam St., Resalat Square, Tehran, Iran
2
  Iran University of Science and Technology (IUST), University St., Hengam St., Resalat Square, Tehran, Iran


                                                                         Abstract
In this paper, we address hate speech and offensive content detection for English and Indo-Aryan languages, covering the shared tasks on hate speech detection in Sinhala and Gujarati (Tasks 1A and 1B [1, 2, 3]) and on identifying hateful spans in English text already detected as hateful (Task 3 [4, 5, 6]). The study compares multilingual models applied to translated text against language-specific models fine-tuned on the source text, and evaluates DistilBERT and XLM-RoBERTa for hateful span identification. Results show that fine-tuned source-language models excel in hate speech detection, especially when ample high-quality source data is available. Language models with good pre-training data (as for Sinhala and English) perform best, while the scarcity of such models for languages like Gujarati gives XLM-RoBERTa an advantage over source-language models. This shows the importance of the data language models are pre-trained on for hate speech detection. Moreover, XLM-RoBERTa surpasses DistilBERT in identifying hateful spans in Task 3. In Task 1A we ranked 6th out of 16 teams; in Task 1B we stood 13th among 17 teams; in Task 3 our method achieved 5th place on the public leaderboard (on 30% of the test data) and 2nd place on the private leaderboard (on 70% of the test data) among 12 teams. For Task 1 our team goes by the name "NAVICK", and for Task 3 we are identified as "Mohammadmostafa78". In Task 1A, our best results are an Accuracy of 83.24, Precision of 84.03, Recall of 83.24, and F1-score of 82.90. In Task 1B, our best performance is a Precision of 70.38, Recall of 73.64, and F1-score of 69.46. For Task 3, we attain a Precision of 48.81, Recall of 55.39, F1-score of 51.89, and Accuracy of 90.09, along with a Public F1-score of 44.17 and a Private F1-score of 51.38.

                                                                         Keywords
                                                                         Hate speech, Offensive content detection, English, Indo-Aryan languages, Sinhala, Gujarati, Multilingual
                                                                         models, Translated text, Source language-specific fine-tuned models, DistilBERT, XLM-RoBERTa, Hateful
                                                                         span identification




                                1. Introduction
The proliferation of hate speech and offensive content on digital platforms has underscored the
urgency of developing effective detection methods across languages. Such methods can protect
users from offensive content and make it possible to detect and censor the offensive parts of a post.

Forum for Information Retrieval Evaluation, December 15-18, 2023, India
mo_rostamkhani97@comp.iust.ac.ir (M. Rostamkhani); sauleh@iust.ac.ir (S. Eetemadi)
https://www.sauleh.ir/ (S. Eetemadi)
ORCID: 0009-0007-0529-6831 (M. Rostamkhani); 0000-0003-1376-2023 (S. Eetemadi)
                                                                       © 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), ISSN 1613-0073




Figure 1: System architectures for (a) Task 1 and (b) Task 3.


A variety of methods has been used for hate speech detection, including traditional classifiers
[7, 8, 9, 10], deep-learning-based classifiers [11, 12], and combinations of both approaches
[13]. There is also research investigating the benefit of first fine-tuning multilingual models
on English hate speech and subsequently fine-tuning them on labeled data in the target
language [14].
   This paper investigates hate speech identification in both low-resource Indo-Aryan
languages (Sinhala and Gujarati) and English. The study encompasses three key tasks:
classifying tweets as hate/offensive or not (Tasks 1A and 1B) and detecting hateful spans within
sentences (Task 3). To tackle Task 1, we use two approaches: leveraging translation services in
combination with multilingual models, and utilizing language models fine-tuned on the source
languages.
   The code and data associated with our research are made openly available to the community
for further exploration and validation in this GitHub Repository.


2. Background
For Task 1, our model receives tokenized text as input and outputs the corresponding hate
speech class. For Task 3, our model receives tokenized text as input and outputs a label for
each token.


3. System overview
For Task 1 [1, 2, 3], we use several kinds of models: English-only, multilingual, and source-
language models. In the translation-based setting, we translate the training and test data into
English and then use the model to predict labels.
   For the hate speech classification tasks (Tasks 1A and 1B), our methodology begins with an
examination of the label distribution for each class. Task 1A displays a class ratio of
approximately 4:3, while Task 1B presents a balanced 1:1 ratio. Although a minor class
imbalance exists in the Task 1A dataset, oversampling or undersampling techniques are
unnecessary. For this task, we compare two strategies. The first uses a translation-based
technique: the Google Translate API converts content in the source language into English,
and the translations are then fed, as English text, to multilingual models and to an
English-only model pre-trained on a hate speech corpus. The second approach uses models
fine-tuned on the source languages.
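A minimal sketch of the translation-based pipeline is shown below. The `translate_to_english` helper is a hypothetical stand-in for the Google Translate API call, and the checkpoint name is illustrative (in practice the classifier would first be fine-tuned on the task's training data):

```python
from transformers import pipeline

def translate_to_english(text: str) -> str:
    """Hypothetical stand-in for the Google Translate API call; any
    translation service can be substituted here."""
    # Send `text` to the translation service and return the English result.
    return text  # identity fallback so the sketch runs end to end

# Illustrative checkpoint from Table 2; not necessarily the exact model we used.
classifier = pipeline("text-classification",
                      model="cardiffnlp/twitter-xlm-roberta-base-sentiment")

def predict_hate_label(source_text: str) -> str:
    english_text = translate_to_english(source_text)  # Sinhala/Gujarati -> English
    return classifier(english_text)[0]["label"]       # classify as English text
```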
   For Task 3 [4, 5, 6], which pertains to identifying hateful spans in English sentences, we use
the BIO notation for sequence labeling. For words consisting of more than one token, we assign
Figure 2: Label distribution for (a) Task 1A and (b) Task 1B.


the label of the word to the first token and ignore the remaining tokens. This prevents the
model from becoming biased towards lengthy words that consist of more than one token. The
remaining tokens of a word, except the first, are assigned the value -100 so that they are
ignored by the loss function and their impact is mitigated. We use the DistilBERT [15] and
XLM-RoBERTa [16] base models for this task and compare their results. In our system
overview for Task 3, we outline the key steps and components of our approach to hate span
detection.

3.1. Tokenization
We start by tokenizing the input text, breaking it down into individual tokens.
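A minimal sketch of this step with a Hugging Face tokenizer (the checkpoint is illustrative; for Task 3 we compare DistilBERT and XLM-RoBERTa):

```python
from transformers import AutoTokenizer

# Illustrative tokenizer choice.
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

words = ["this", "is", "unacceptable"]      # a tweet pre-split into words
encoding = tokenizer(words, is_split_into_words=True)

# Long or rare words may be split into several sub-word pieces ('##'-prefixed
# for BERT-style tokenizers), which is why the label alignment of Section 3.2
# is needed.
print(tokenizer.convert_ids_to_tokens(encoding["input_ids"]))
```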

3.2. Label Assignment for Multi-Token Words
To handle words consisting of more than one token, we adopt a strategy where we assign
the label of the entire word to its initial token. Any subsequent tokens from the same word
are assigned a special value of -100, effectively excluding them from the loss function during
training to mitigate their impact.
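A minimal sketch of this alignment, reusing the `encoding` from the tokenization sketch above; the label ids are hypothetical (the actual label set is defined by the task data):

```python
def align_labels_with_tokens(word_labels, word_ids):
    """Assign each word's label to its first sub-token; mask the rest with -100."""
    labels, previous_word = [], None
    for word_id in word_ids:
        if word_id is None:                 # special tokens such as [CLS] / [SEP]
            labels.append(-100)
        elif word_id != previous_word:      # first sub-token keeps the word label
            labels.append(word_labels[word_id])
        else:                               # later sub-tokens are ignored by the loss
            labels.append(-100)
        previous_word = word_id
    return labels

# Usage with the `encoding` from Section 3.1 and hypothetical label ids,
# one per word (e.g. 0 = O, 1 = B-HATE, 2 = I-HATE):
token_labels = align_labels_with_tokens([0, 0, 1], encoding.word_ids())
```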

3.3. Model Selection
We employ two distinct transformer-based models, DistilBERT and XLM-RoBERTa, to explore
their performance in hate span detection.

3.4. Data Split
We use the HASOC Task 3 dataset [17] as our main dataset. For validation purposes, we partition
off 10% of the dataset, reserving the remaining 90% for model training.
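A minimal sketch of this split with the Hugging Face `datasets` library; the file name and the seed are hypothetical (the actual files come from the HASOC Task 3 release [17]):

```python
from datasets import load_dataset

# Hypothetical loading step for the HASOC Task 3 training data.
dataset = load_dataset("json", data_files="hasoc_task3_train.json")["train"]

# 90% training / 10% validation, as described above.
split = dataset.train_test_split(test_size=0.1, seed=42)
train_ds, val_ds = split["train"], split["test"]
```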

3.5. Performance Metrics
Throughout the training process, we monitor and report key evaluation metrics for each epoch,
including precision, recall, F1 score, and accuracy. These metrics help us assess the model’s
progress and effectiveness.
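A sketch of such a metrics function for token classification is given below; the macro averaging and the use of scikit-learn are our assumptions rather than reported details:

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def compute_metrics(eval_pred):
    """Token-level metrics, skipping the positions masked with -100 above."""
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)

    true_preds, true_labels = [], []
    for pred_row, label_row in zip(predictions, labels):
        for p, l in zip(pred_row, label_row):
            if l != -100:                   # ignore special and non-first sub-tokens
                true_preds.append(p)
                true_labels.append(l)

    precision, recall, f1, _ = precision_recall_fscore_support(
        true_labels, true_preds, average="macro", zero_division=0)
    return {"precision": precision, "recall": recall, "f1": f1,
            "accuracy": accuracy_score(true_labels, true_preds)}
```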
3.6. Training Procedure
We utilize the Hugging Face Trainer, a powerful training framework, to facilitate the training
and evaluation of our models. This framework streamlines the training process and provides
insights into model performance.

3.7. Model Selection Based on Validation Loss
At the conclusion of training, we select the model with the lowest validation loss as our final
model. This ensures that we choose the model configuration that optimizes performance.
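Putting Sections 3.4 through 3.7 together, a minimal sketch built on the Hugging Face Trainer might look as follows. It reuses `train_ds` and `val_ds` from Section 3.4, `compute_metrics` from Section 3.5, and the `training_args` configured in the Section 4 sketch below; the checkpoint and label-set size are illustrative assumptions:

```python
from transformers import (AutoModelForTokenClassification, AutoTokenizer,
                          DataCollatorForTokenClassification, Trainer)

# Illustrative checkpoint and label-set size (e.g. O / B-HATE / I-HATE).
model = AutoModelForTokenClassification.from_pretrained("xlm-roberta-base",
                                                        num_labels=3)
tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")

trainer = Trainer(
    model=model,
    args=training_args,               # TrainingArguments sketched in Section 4
    train_dataset=train_ds,           # 90% split, tokenized and label-aligned
    eval_dataset=val_ds,              # 10% validation split
    data_collator=DataCollatorForTokenClassification(tokenizer),
    compute_metrics=compute_metrics,  # per-epoch metrics (Section 3.5)
)
trainer.train()  # with load_best_model_at_end=True, the checkpoint with the
                 # lowest validation loss is restored after training
```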

3.8. Test Data Preparation
Prior to making predictions on the test dataset, we apply the same preprocessing steps as used
during training to ensure consistency and fairness in our evaluation.

3.9. Prediction Generation
We employ the selected model to generate predictions for our test data, detecting hateful spans
within the test sentences.
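A sketch of this prediction step, reusing the `tokenizer` and `trainer` from the sketches above; the whitespace pre-tokenization is an assumption about the preprocessing:

```python
import torch

def predict_spans(text: str):
    """Label each word of a test sentence with the fine-tuned model."""
    words = text.split()  # whitespace pre-tokenization, an assumption
    encoding = tokenizer(words, is_split_into_words=True,
                         truncation=True, return_tensors="pt")
    inputs = {k: v.to(trainer.model.device) for k, v in encoding.items()}
    with torch.no_grad():
        logits = trainer.model(**inputs).logits
    token_preds = logits.argmax(dim=-1)[0].tolist()

    # Keep only the prediction on each word's first sub-token, mirroring training.
    word_preds, previous_word = [], None
    for idx, word_id in enumerate(encoding.word_ids()):
        if word_id is not None and word_id != previous_word:
            word_preds.append((words[word_id], token_preds[idx]))
        previous_word = word_id
    return word_preds
```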


4. Experimental setup
In our experimental setup, we carefully configured the parameters to train and evaluate our
hate span detection models. The following details provide insight into our setup:

4.1. Data Split
We partitioned our dataset into two subsets: 90% for training and 10% for validation. This
allowed us to train our models on a substantial portion of the data while reserving a separate
subset for assessing their performance.

4.2. Training Configuration
We conducted training for a total of 5 epochs. Through experimentation, we determined that the
optimal model performance was achieved at epoch 2, which we identified as the "best epoch."

4.3. Learning Rate
The learning rate used during training was set to 2e-5. This rate helps govern the step size
taken during the optimization process, influencing the model’s convergence and performance.

4.4. Weight Decay
To control overfitting and fine-tune model parameters, we applied a weight decay of 0.01. This
regularization technique helps prevent excessive parameter updates during training.
4.5. Batch Sizes
During training, we utilized a batch size of 16 for both the training and evaluation phases. Batch
sizes influence the efficiency of the training process and can impact memory usage.


                                      Hyperparameter           Value
                                      Train-Test Split         90% - 10%
                                      Max Epoch                5
                                      Best Epoch               2
                                      Learning Rate            2 × 10⁻⁵
                                      Weight Decay             0.01
                                      Batch Size               16
Table 1
Hyperparameter Settings


By carefully configuring these parameters and splitting our data into training and validation
sets, we aimed to ensure a robust and well-tuned training process, ultimately leading to the
selection of the best-performing model for hate span detection.
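As a hedged sketch, the hyperparameters of Table 1 map onto a Hugging Face `TrainingArguments` object roughly as follows; this is the `training_args` consumed by the Trainer sketch in Section 3.7, and `output_dir`, the epoch-level strategies, and the save behavior are our assumptions rather than reported settings:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="hate-span-model",     # hypothetical output path
    num_train_epochs=5,               # Max Epoch
    learning_rate=2e-5,               # Learning Rate
    weight_decay=0.01,                # Weight Decay
    per_device_train_batch_size=16,   # Batch Size (train)
    per_device_eval_batch_size=16,    # Batch Size (eval)
    evaluation_strategy="epoch",      # report metrics each epoch (Section 3.5)
    save_strategy="epoch",
    load_best_model_at_end=True,      # keep the lowest-validation-loss model (Section 3.7)
)
```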


5. Results
In Task 1A, our experiments demonstrated that models fine-tuned on the source languages
outperformed the translation-based approach. This can be attributed to the preservation of
linguistic nuances and contextual understanding in language-specific models, to the absence
of a proficient translation model that produces correct and accurate translations, and to issues
present within the translated sentences. For example, "@USER" was altered in some translations
but left intact in others; removing "@USER" entirely would prevent this issue. The models
fine-tuned on the source language achieve higher precision, recall, and F1-score in all cases
for identifying hate speech and offensive content.
 Language   Model                                      Loss     Test Accuracy   Test Precision   Test Recall   Test F1
 English    distilbert-base-uncased                    0.6252   0.6408          0.6231           0.6132        0.6144
 English    twitter-xlm-roberta-base-sentiment         0.5484   0.6416          0.6282           0.6278        0.6280
 English    roberta-hate-speech-dynabench-r4-target    0.6317   0.6568          0.6422           0.6223        0.6228
 Sinhala    SinhalaBERTo                               0.4169   0.8228          0.8216           0.8075        0.8126
 Sinhala    xlm-t-hasoc-hi                             0.4199   0.8244          0.8267           0.8244        0.8208
 Sinhala    xlm-t-hasoc-hi-sold-si                     0.4116   0.8324          0.8403           0.8324        0.8272
 Sinhala    xlm-t-sold-si                              0.4284   0.8316          0.8326           0.8316        0.8290

Table 2
Results of Task 1A (hate speech detection for Sinhala)

  In the evaluation of hateful span detection, we assessed both DistilBERT and XLM-RoBERTa
models. The outcomes highlighted XLM-RoBERTa’s effectiveness in identifying hateful spans
within sentences, attaining superior F1 scores. This underscores the significance of harnessing
pre-trained models explicitly engineered for cross-lingual and contextual comprehension tasks.
 Language   Model                                  Precision   Recall   F1-Score
 Gujarati   gujarati-bert                          0.6917      0.6835   0.6870
 Gujarati   Gujarati-Model                         0.3383      0.3843   0.3577
 Gujarati   Gujarati-XLM-R-Base                    0.3428      0.5000   0.4067
 English    twitter-xlm-roberta-base-sentiment     0.7038      0.7364   0.6946

Table 3
Results of Task 1B (hate speech detection for Gujarati)

          Model            Val Loss   Val Precision   Val Recall   Val F1-Score   Val Accuracy   Public F1   Private F1
 distilbert-base-uncased    0.3184        0.4712        0.5221        0.4953          0.8906      0.4389       0.4941
    xlm-roberta-base        0.2877       0.4881        0.5539         0.5189         0.9003       0.4417       0.5138

Table 4
Results of Task 3 (hate span identification)


  The organizers implemented two sets of metrics and leaderboards: the public leaderboard,
which evaluates performance using roughly 30% of the test data, and the private leaderboard,
which uses the remaining approximately 70%. On the public leaderboard we achieved 5th place,
whereas on the private leaderboard we secured 2nd place.


6. Conclusion
This paper investigates hate speech detection in English and Indo-Aryan languages, presenting
results on the Sinhala and Gujarati tasks (Tasks 1A and 1B) as well as English hateful span
identification (Task 3). Comparing translation-based multilingual models with language-specific
fine-tuned models, it also evaluates DistilBERT and XLM-RoBERTa for hateful span identification.
Fine-tuned source-language models excel in hate speech detection, particularly with ample high-
quality source data, benefiting languages like Sinhala. The scarcity of such models for languages
like Gujarati highlights XLM-RoBERTa's advantage, underscoring the significance of tailored
data and language models. Moreover, XLM-RoBERTa outperforms DistilBERT in identifying
hateful spans, accentuating the importance of language-specific models in advancing cross-lingual
processing.


References
 [1] T. Ranasinghe, I. Anuradha, D. Premasiri, K. Silva, H. Hettiarachchi, L. Uyangodage,
     M. Zampieri, SOLD: Sinhala offensive language dataset, arXiv preprint arXiv:2212.00851
     (2022).
 [2] S. Satapara, H. Madhu, T. Ranasinghe, A. E. Dmonte, M. Zampieri, P. Pandya, N. Shah,
     M. Sandip, P. Majumder, T. Mandl, Overview of the HASOC subtrack at FIRE 2023: Hate-
     speech identification in Sinhala and Gujarati, in: K. Ghosh, T. Mandl, P. Majumder, M. Mitra
     (Eds.), Working Notes of FIRE 2023 - Forum for Information Retrieval Evaluation, Goa,
     India. December 15-18, 2023, CEUR Workshop Proceedings, CEUR-WS.org, 2023.
 [3] T. Ranasinghe, K. Ghosh, A. S. Pal, A. Senapati, A. E. Dmonte, M. Zampieri, S. Modha,
     S. Satapara, Overview of the HASOC subtracks at FIRE 2023: Hate speech and offensive
     content identification in assamese, bengali, bodo, gujarati and sinhala, in: Proceedings of
     the 15th Annual Meeting of the Forum for Information Retrieval Evaluation, FIRE 2023,
     Goa, India. December 15-18, 2023, ACM, 2023.
 [4] S. Masud, M. A. Khan, M. S. Akhtar, T. Chakraborty, Overview of the HASOC Subtrack
     at FIRE 2023: Identification of Tokens Contributing to Explicit Hate in English by Span
     Detection, in: Working Notes of FIRE 2023 - Forum for Information Retrieval Evaluation,
     CEUR, 2023.
 [5] K. Ghosh, A. Senapati, A. S. Pal, Annihilate Hates (Task 4, HASOC 2023): Hate Speech
     Detection in Assamese, Bengali, and Bodo languages, in: Working Notes of FIRE 2023 -
     Forum for Information Retrieval Evaluation, CEUR, 2023.
 [6] S. Satapara, S. Masud, H. Madhu, M. A. Khan, M. S. Akhtar, T. Chakraborty, S. Modha,
     T. Mandl, Overview of the HASOC subtracks at FIRE 2023: Detection of hate spans and
     conversational hate-speech, in: Proceedings of the 15th Annual Meeting of the Forum
     for Information Retrieval Evaluation, FIRE 2023, Goa, India. December 15-18, 2023, ACM,
     2023.
 [7] T. Davidson, D. Warmsley, M. Macy, I. Weber, Automated hate speech detection
     and the problem of offensive language, CoRR abs/1703.04009 (2017). URL: http://arxiv.org/
     abs/1703.04009.
 [8] Z. Waseem, D. Hovy, Hateful symbols or hateful people? predictive features for hate
     speech detection on Twitter, in: Proceedings of the NAACL Student Research Workshop,
     Association for Computational Linguistics, San Diego, California, 2016, pp. 88–93. URL:
     https://aclanthology.org/N16-2013. doi:10.18653/v1/N16-2013.
 [9] S. MacAvaney, H.-R. Yao, E. Yang, K. Russell, N. Goharian, O. Frieder, Hate speech
     detection: Challenges and solutions, PLoS ONE 14 (2019).
[10] E. W. Pamungkas, V. Basile, V. Patti, Misogyny detection in Twitter: a multilingual
     and cross-domain study, Information Processing & Management 57 (2020) 102360. URL:
     https://www.sciencedirect.com/science/article/pii/S0306457320308554. doi:10.1016/
     j.ipm.2020.102360.
[11] S. Agrawal, A. Awekar, Deep learning for detecting cyberbullying across multiple social
     media platforms, CoRR abs/1801.06482 (2018). URL: http://arxiv.org/abs/1801.06482.
     arXiv:1801.06482.
[12] P. Badjatiya, S. Gupta, M. Gupta, V. Varma, Deep learning for hate speech detection
     in tweets, CoRR abs/1706.00188 (2017). URL: http://arxiv.org/abs/1706.00188.
     arXiv:1706.00188.
[13] Z. Mossie, J.-H. Wang, Vulnerable community identification using hate speech detection
     on social media, Information Processing & Management 57 (2020) 102087. URL:
     https://www.sciencedirect.com/science/article/pii/S0306457318310902. doi:10.1016/
     j.ipm.2019.102087.
[14] P. Röttger, D. Nozza, F. Bianchi, D. Hovy, Data-efficient strategies for expanding hate
     speech detection into under-resourced languages, 2022. arXiv:2210.11359 .
[15] V. Sanh, L. Debut, J. Chaumond, T. Wolf, DistilBERT, a distilled version of BERT: smaller,
     faster, cheaper and lighter, CoRR abs/1910.01108 (2019). URL: http://arxiv.org/abs/1910.
     01108. arXiv:1910.01108.
[16] A. Conneau, K. Khandelwal, N. Goyal, V. Chaudhary, G. Wenzek, F. Guzmán, E. Grave,
     M. Ott, L. Zettlemoyer, V. Stoyanov, Unsupervised cross-lingual representation
     learning at scale, CoRR abs/1911.02116 (2019). URL: http://arxiv.org/abs/1911.02116.
     arXiv:1911.02116 .
[17] S. Masud, M. Bedi, M. A. Khan, M. S. Akhtar, T. Chakraborty, Proactively reducing the
     hate intensity of online posts via hate speech normalization, in: Proceedings of the
     28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD ’22,
     Association for Computing Machinery, New York, NY, USA, 2022, p. 3524–3534. URL:
     https://doi.org/10.1145/3534678.3539161. doi:10.1145/3534678.3539161 .