<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Impact of Data Augmentation on Hate Speech Detection in Roman Urdu</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Fariha Maqbool</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Blerina Spahiu</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Andrea Maurino</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Dipartimento di informatica, sistemistica e comunicazione University of Milano-Bicocca Viale Sarca 336</institution>
          ,
          <addr-line>20126 Milan</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The prevalence of hate speech leads to an increase in hate crimes, online violence, and serious harm to social safety, physical security, and cyberspace. To address this issue, several studies have been conducted on hate speech detection in European languages, whereas little attention has been paid to low-resource South Asian languages, leaving millions of social media users vulnerable. Given the scarcity of datasets and available samples, strategies are needed to increase the number of data samples. In this paper, we improve the performance of an already fine-tuned m-BERT model by applying data augmentation techniques to a dataset of hate speech in Roman Urdu tweets. F1-score and accuracy metrics have been used to compare the results. We also experiment to determine the optimal percentage of augmented data to include and the percentage of words augmented in each data instance. The new RUHSOLD++ dataset containing the augmented data has also been published publicly. The improvement in the model's hate speech detection shows that model performance can be improved by applying data augmentation techniques to datasets with a limited number of instances.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        The exponential growth of social media platforms like Facebook, X (formerly Twitter), and
YouTube has provided a global stage for individuals from diverse cultures and social
backgrounds to communicate and share their opinions on a myriad of topics. While these platforms
uphold the principle of freedom of speech, some users negatively exploit this freedom and
abuse other users on the basis of gender, religion, or race. This surge in harmful content has
underscored the need for increased research in natural language processing (NLP) to effectively
detect instances of hate speech. The consequences faced by victims of targeted hate speech
are not limited to physical harm; victims also experience a profound sense of dread and rejection
within their communities. Recognizing the urgency of creating online spaces free of racism and
hate speech, researchers emphasize the importance of early detection mechanisms to mitigate
the pervasive harm caused by such content [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. This challenge extends beyond the English
language, as millions of users worldwide employ diverse languages as vehicles for spreading
hate. Despite extensive research in English, there exists a noticeable dearth of datasets and
studies on languages like Urdu. Urdu, spoken by over 170 million people globally, faces unique
challenges in written communication, with an alphabet that cannot be easily mapped onto an
English keyboard: the Urdu language comprises 40 characters, while English keyboards
accommodate only 26 letters, so mapping the Urdu alphabet directly onto an English keyboard is
impractical. Consequently, the predominant approach among Urdu speakers, particularly on social
media platforms, is the use of Roman Urdu, a transliteration of Urdu using English letters. The
use of Roman Urdu has sharply expanded as a result of social media’s rising adoption [
        <xref ref-type="bibr" rid="ref2 ref3">2, 3</xref>
        ]. Users regularly utilize these platforms to share their opinions
about a variety of products, services, politics, and other topics. However, despite its widespread
use, Roman Urdu (RU) encounters challenges such as a lack of linguistic resources, annotated
datasets, and dedicated language models [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. To address these limitations and enhance model
performance, researchers have explored data augmentation techniques [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. In a notable
application, simple data augmentation techniques were employed on a low-resource language dataset
with a limited number of samples. The results demonstrated a significant improvement in model
performance, underscoring the potential of augmentation in mitigating the challenges posed by
scarce linguistic resources. Additionally, experiments aimed to identify the optimal percentage
of augmented data to be integrated with the original dataset, boosting model performance
while minimizing training time. This multifaceted approach contributes to ongoing efforts to
combat hate speech across various languages, fostering inclusivity and positive discourse in
online spaces. In this paper, we make the following contributions: (i) enrich the RUHSOLD dataset
[
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] and create a new RUHSOLD++ dataset; (ii) provide a new custom function in Python to
dynamically alter the spelling of selected words within a sentence; and (iii) provide an empirical
analysis of the different data augmentation methods. The paper is structured as follows: Section
2 discusses approaches to detect hate speech in the Roman Urdu language. In Section 3 we
provide the methodology to augment our initial dataset. Section 4 provides the analysis and
findings obtained by applying different methods for data augmentation, while conclusions and
future work end the paper in Section 5.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        The issue of abusive speech has been a longstanding focus within the research community.
In earlier investigations, attempts to identify abusive users primarily relied on lexical and
syntactic features extracted from their posts [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. However, recent advancements in automated
hate speech detection have witnessed substantial growth. The availability of large datasets
has prompted a shift in academic research towards more data-intensive and sophisticated
models, notably leveraging deep learning techniques [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] and graph embedding methods [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
Notably, transformer-based language models like BERT[
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] have gained immense popularity in
various downstream tasks, proving particularly effective in surpassing traditional deep
learning models such as CNN-GRU [9] and LSTM [10] for the detection of abusive language [11, 12].
This evolution highlights the dynamic nature of research in addressing the complexities
of abusive speech detection. Detecting hate and abusive speech in low-resource languages
presents a formidable challenge, as exemplified in the context of Roman Urdu. The scarcity of
linguistic resources for Roman Urdu spurred the creation of the RUHSOLD dataset by H. Rizwan
et al. [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Comprising 10,012 tweets, this dataset stands out for its dual approach, offering
both coarse-grained and fine-grained labeling of hate speech instances. In their research,
Rizwan and colleagues not only curated this valuable dataset but also proposed a deep-learning-based
architecture specifically tailored for hate speech detection in Roman Urdu. Addressing
the broader landscape of multilingual abusive speech, M. Das et al. [13] conducted an
in-depth investigation into the performance of multilingual models across eight distinct Indic
languages. In a noteworthy application, they employed m-BERT [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] and MuRIL [14] models on
the RUHSOLD dataset [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] to gauge their efficacy in detecting abusive speech. Through a series of
meticulously designed experiments encompassing various settings, Das and team explored the
nuances of multilingual hate speech detection. Their findings underscored the effectiveness of
model transfers, revealing that transferring knowledge from one language to another enhances
the overall performance of the models. This body of research not only contributes to the
evolving field of hate speech detection but also illuminates the specific challenges associated
with low-resource languages like Roman Urdu. By providing a robust dataset and proposing
dedicated architectures, these studies lay essential groundwork for future endeavors aimed at
combating hate speech across diverse linguistic landscapes. The insights gained from these
investigations, especially regarding the transferability of models, offer valuable guidance for
the development of more inclusive and effective hate speech detection systems in multilingual
contexts. In a meticulous analysis, M. M. Khan et al. [15] delved into the complexities of hate
speech detection in Roman Urdu, manually examining over 90,000 tweets to curate a substantial
corpus of 5,000 Roman Urdu tweets. Their significant contribution extended beyond dataset
creation, as they systematically employed five supervised learning approaches, including a
sophisticated deep learning technique, to rigorously evaluate and compare their effectiveness in
hate speech detection. The results of their comprehensive study revealed that, across two levels
of categorization, logistic regression outperformed all other techniques, opening up a viable
path for robust hate speech detection in Roman Urdu. Recognizing the challenges posed
by the low resources of Roman Urdu, Azam et al. [16] undertook a proactive exploration of
data augmentation strategies. Leveraging both easy data augmentation and transformer-based
augmentation approaches, they aimed to enhance hate speech detection capabilities in Roman
Urdu. The researchers conducted experiments using existing datasets in Roman Urdu and
baseline models to meticulously assess the impact of augmentation techniques. Their findings
unequivocally demonstrated that the performance of hate speech detection models could indeed
be significantly improved by the strategic application of augmentation techniques to the dataset.
This research not only contributes to the optimization of hate speech detection in low-resourced
languages like Roman Urdu but also highlights the potential of augmentation strategies as
valuable tools in mitigating the impact of resource constraints, providing valuable insights for
the ongoing evolution of hate speech detection methodologies.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Methodology</title>
      <sec id="sec-3-1">
        <title>3.1. Dataset</title>
        <p>
          We employed the RUHSOLD dataset, a comprehensive collection of tweets in Roman Urdu
created by H. Rizwan et al. [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]. The authors meticulously established a gold standard for
two distinct sub-tasks. Our focus centered on the first sub-task, which involves binary labels
categorizing content as either Hate-Offensive (labeled as 0) or Normal (labeled as 1), representing
inoffensive language. The dataset comprises a total of 10,012 tweets, partitioned
into training, testing, and validation sets in a ratio of 70%, 20%, and 10%, respectively.
        </p>
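        <p>As an illustration, the 70/20/10 partition described above can be sketched as follows; the function name and toy data are ours, not part of the RUHSOLD release, whose official splits ship with the dataset.</p>

```python
import random

def split_dataset(rows, seed=42):
    """Shuffle and partition rows into 70% train, 20% test, 10% validation.

    Illustrative sketch only; the official RUHSOLD splits are distributed
    with the dataset itself.
    """
    rng = random.Random(seed)
    rows = rows[:]          # copy so the caller's list is untouched
    rng.shuffle(rows)
    n = len(rows)
    n_train = n * 7 // 10   # integer arithmetic avoids float rounding
    n_test = n * 2 // 10
    return rows[:n_train], rows[n_train:n_train + n_test], rows[n_train + n_test:]

train, test, val = split_dataset([("tweet %d" % i, i % 2) for i in range(10000)])
print(len(train), len(test), len(val))  # 7000 2000 1000
```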
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Data Augmentation</title>
        <p>To boost our dataset’s size and improve overall performance, we strategically
employed noise-based data augmentation techniques on the training data. We explored
different percentages for augmenting the dataset to strike the right balance, and we further
adjusted the percentage of words in each tweet that underwent augmentation. This nuanced approach
was not only about expanding the dataset; it was about carefully calibrating the augmentation
impact and finding the sweet spot between quantity and quality to strengthen our model’s
resilience. Through methodical experimentation, we aimed to uncover the most effective
configurations that could genuinely enhance the overall performance of our model.</p>
        <sec id="sec-3-2-1">
          <title>3.2.1. Random Swapping</title>
          <p>Random swapping is an effective technique in the realm of noise-based data augmentation. It
randomly swaps words or tokens within a tweet. For example: "hum
kisi se km nhi" becomes "km kisi se hum nhi".</p>
          <p>This operation adds variance to the dataset without changing the overall sentiment or
context of the text. The intent is to expose the model to various word configurations so that it
generalizes and functions well on a variety of linguistic patterns. The percentage
of words augmented in random swapping directly controls the degree of variability injected
into the dataset; it is therefore essential to find the optimal percentage of words to
be swapped during augmentation.</p>
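          <p>A minimal sketch of random swapping, assuming a whitespace tokenizer and a hypothetical aug_p parameter for the fraction of words taking part in swaps (the paper's experiments use the NLPAug library instead):</p>

```python
import random

def random_swap(sentence, aug_p=0.3, seed=None):
    """Randomly swap pairs of words in a sentence.

    aug_p is the fraction of words taking part in swaps; the function
    name and signature are illustrative, not from the paper's code.
    """
    rng = random.Random(seed)
    words = sentence.split()
    # Each swap touches two words, so an aug_p fraction of words
    # corresponds to roughly aug_p * len(words) / 2 swap operations.
    n_swaps = max(1, int(aug_p * len(words) / 2))
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(words)), 2)
        words[i], words[j] = words[j], words[i]
    return " ".join(words)

print(random_swap("hum kisi se km nhi", aug_p=0.4, seed=1))
```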
        </sec>
        <sec id="sec-3-2-2">
          <title>3.2.2. Random Deletion</title>
          <p>Another noise-based data augmentation is Random Deletion. This strategy involves the
deliberate and random removal of words or tokens from a given sentence, introducing an element
of unpredictability and variability. We designate a specific percentage of words within the
sentence for potential removal, aiming to strike a careful balance between introducing noise
for robustness and preserving the coherence of the text. By implementing this intentional
randomness, we infuse the dataset with a dynamic quality, fortifying the model’s adaptability
to diverse linguistic nuances. This method serves as a potent instrument, enriching our model’s
adaptability and efficacy across a wide range of textual inputs.</p>
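          <p>Random deletion can be sketched in the same spirit; the helper below and its del_p parameter are illustrative assumptions, not the paper's exact implementation:</p>

```python
import random

def random_deletion(sentence, del_p=0.2, seed=None):
    """Remove a del_p fraction of words at random, keeping at least one word."""
    rng = random.Random(seed)
    words = sentence.split()
    # Cap the number of deletions so a short tweet is never emptied out.
    n_drop = min(int(del_p * len(words)), len(words) - 1)
    drop = set(rng.sample(range(len(words)), n_drop))
    # Surviving words keep their original order, preserving coherence.
    return " ".join(w for i, w in enumerate(words) if i not in drop)

print(random_deletion("hum kisi se km nhi", del_p=0.2, seed=7))
```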
        </sec>
        <sec id="sec-3-2-3">
          <title>3.2.3. Spelling Augmentation</title>
          <p>Spelling Augmentation adds a layer of complexity by altering the spelling of words within
a sentence. This process entails replacing one or more characters in a word with randomly
chosen alternatives and deliberately introducing a controlled amount of noise into the data. The
purpose here is to diversify the linguistic patterns in the dataset, enhancing the model’s ability
to handle variations in spelling and promoting resilience against potential inconsistencies in
user-generated content. This meticulous introduction of noise through character substitution
serves as a strategic maneuver, refining our model’s capacity to adapt to a wide array of spelling
idiosyncrasies. For example: "chal ja tujhy maaf kia" becomes "chal aa tujha maaf kia".</p>
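          <p>A rough sketch of such a character-substitution augmenter, under the assumption of a Latin alphabet and a hypothetical word_p parameter for the fraction of words altered (the paper's own custom function may differ):</p>

```python
import random

def spelling_aug(sentence, word_p=0.3, seed=None):
    """Replace one random character in a word_p fraction of the words.

    Illustrative sketch; alphabet and parameter names are assumptions.
    """
    rng = random.Random(seed)
    alphabet = "abcdefghijklmnopqrstuvwxyz"
    words = sentence.split()
    n_aug = max(1, int(word_p * len(words)))
    for i in rng.sample(range(len(words)), n_aug):
        w = list(words[i])
        pos = rng.randrange(len(w))
        w[pos] = rng.choice(alphabet)  # substitute one character at random
        words[i] = "".join(w)
    return " ".join(words)

print(spelling_aug("chal ja tujhy maaf kia", word_p=0.3, seed=3))
```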
        </sec>
      </sec>
      <sec id="sec-3-3">
        <title>3.3. The new RUHSOLD++ Dataset</title>
          <p>After employing the data augmentation techniques, we created a new RUHSOLD++ dataset,
which is publicly available to promote future work. The dataset consists of three variants of the data,
with swap, delete, and spelling augmentation applied. It contains the augmented data
distributed uniformly across the train and validation sets, while the test data remains unaltered.</p>
      </sec>
      <sec id="sec-3-4">
        <title>3.4. Model</title>
        <p>
          The m-BERT model has garnered significant attention in the realm of abusive speech research.
Its efficacy is attributed to being pre-trained on a comprehensive dataset, comprising the
extensive content of Wikipedia, with a masked language modeling (MLM) [17] objective
across 104 languages. This pre-training involves 12 fully connected transformer encoder layers,
incorporating a self-attention mechanism to efficiently process contextual information. It is
worth noting that m-BERT, while powerful, has a token limit of 512, necessitating the use
of a fine-tuned variant introduced by M. Das et al. [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]. Das and colleagues enhanced the
original m-BERT by incorporating a fully connected layer, aligning its output with the CLS
(classification) token in the input. This added layer introduces a level of specificity, with the
output reflecting the model’s interpretation of the input sentence, often represented by the CLS
token output. This nuanced modification allows the model to capture and interpret complex
contextual nuances within the given token limit, contributing to its efficacy in understanding
and classifying abusive speech patterns.
        </p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Experimentation and Results</title>
      <p>
        In this section, we describe the experimentation conducted on the RUHSOLD [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] dataset using
the fine-tuned m-BERT model, a refinement proposed by M. Das et al. [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. In our implementation,
we use the PyTorch library (https://pytorch.org/) in Python, configuring each model to run for 10 epochs with an
Adam optimizer and a batch size of 16. The RUHSOLD++ dataset produced by our augmentation
pipeline is publicly available at https://github.com/fariha231/impact-of-augmentation-ruhsoldplusplus.
      </p>
      <p>The experimentation extended to the exploration of data augmentation techniques, aiming
to strike an optimal balance between computational efficiency and accuracy. The primary focus
was on determining the most suitable percentage of the dataset for applying augmentation. To
achieve this, we selected a portion of the original training dataset that underwent random swap
augmentation. Leveraging the NLPAug library in Python, a word percentage of 30% was chosen,
meaning that 30% of the total words in a tweet would be swapped
with each other. Following the generation of augmented data, it was integrated with
the original dataset and subsequently shuffled to mitigate overfitting concerns. The outcomes
of this experiment are presented in Table 1, showcasing the impact of random swap
at varying percentages of the dataset. This experimentation highlights the strategic choices
made in augmenting the data to achieve an optimal trade-off between efficiency and accuracy.</p>
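      <p>The augment-then-merge-then-shuffle procedure can be sketched as follows; the inline swap helper and the (text, label) row layout are assumptions for illustration, since the actual experiments rely on NLPAug:</p>

```python
import random

def swap_words(text, rng):
    """Swap one random pair of words; a stand-in for an NLPAug augmenter."""
    words = text.split()
    if len(words) != 1:
        i, j = rng.sample(range(len(words)), 2)
        words[i], words[j] = words[j], words[i]
    return " ".join(words)

def augment_dataset(rows, frac=0.5, seed=0):
    """Augment a frac portion of the rows, merge, and shuffle."""
    rng = random.Random(seed)
    n_aug = int(frac * len(rows))
    sample = rng.sample(rows, n_aug)          # pick the portion to augment
    augmented = [(swap_words(t, rng), y) for t, y in sample]
    merged = rows + augmented                 # integrate with the original data
    rng.shuffle(merged)                       # shuffle to mitigate overfitting
    return merged

train = [("hum kisi se km nhi", 1), ("chal ja tujhy maaf kia", 0)]
print(len(augment_dataset(train, frac=0.5)))  # 2 originals + 1 augmented row
```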
      <p>We employed the Macro F1 score (mF1-score) as a performance metric, along with other
evaluation measures. The Macro F1 score allows us to assess the performance of each class
individually while giving equal weight to all classes. Examining the results, we observe a
consistent improvement in both accuracy and mF1-score as the model is trained on augmented
data. However, a notable finding emerges: the highest accuracy is attained when augmentation
is applied to 50% of the data. Beyond this threshold, further increasing the size of augmented
data leads to diminishing returns, resulting in a decline in accuracy and mF1-score. This suggests
that the model tends to overfit the training data when subjected to an excessive amount of
augmented information. While the accuracy of the validation data may show promising signs,
the model’s performance on unseen data, specifically the test data, begins to decrease.</p>
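      <p>For reference, the macro F1 score averages the per-class F1 values with equal weight, which is what makes it insensitive to class imbalance; a minimal hand-rolled version (not the evaluation code we used) looks like this:</p>

```python
def macro_f1(y_true, y_pred):
    """Macro-averaged F1: per-class F1 scores averaged with equal weight."""
    scores = []
    for cls in sorted(set(y_true)):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); define it as 0 for an empty class.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Binary labels as in sub-task A: 0 = Hate-Offensive, 1 = Normal.
print(macro_f1([0, 0, 1, 1], [0, 1, 1, 1]))  # roughly 0.733
```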
      <p>With the optimal augmented data percentage identified, our exploration extends to varying
the percentage of words swapped in each iteration. The results, as depicted in Table 2, indicate
that the overall accuracy and mF1-score exhibit minimal fluctuations with changes in the
word augmentation percentage. However, precision and recall values do showcase variations
corresponding to alterations in the word augmentation percentage. This nuanced observation
underscores the importance of fine-tuning not only the quantity of augmented data but also the
specific aspects of augmentation.</p>
      <p>Subsequently, we implemented the delete data augmentation on 50% of the original training
dataset. Leveraging the NLPAug library in Python, we conducted experiments to generate new
data by selectively removing certain words in each row. This augmented dataset was
integrated with the original data, effectively amplifying the training set by 50%. Our exploration
further extended to varying the percentage of words designated for deletion in each iteration.
The outcomes of this experiment are presented in Table 3. The results demonstrate that this
approach not only diversifies the training data but also involves fine-tuning the augmentation
to achieve improvements in model performance.</p>
      <p>In implementing spelling augmentation, we crafted a custom function in Python to
dynamically alter the spelling of selected words within each sentence. This function provides the
flexibility to adjust the percentage of words in each row subject to augmentation. The outcomes
of this experiment are presented in Table 4. Notably, the results reveal an improvement in
accuracy as we incrementally raise the percentage of augmented words in each row. However,
a cautious approach was adopted, refraining from further increasing the percentage to prevent
potential distortion of the sentence’s meaning. Beyond a certain threshold, excessive alterations
could compromise the contextual integrity of the sentence, potentially undermining the model’s
overall performance.</p>
      <p>After conducting a comprehensive array of experiments, our findings show that the most
favorable accuracy was achieved through swap data augmentation. The optimal model,
exhibiting the highest accuracy, emerged from training with an additional 50% of data, where 30%
of words in each row were subject to swapping. This configuration demonstrated the finest
balance between data enrichment and model performance enhancement.</p>
      <p>To provide a visual representation of the model’s performance, Figure 1 presents the confusion
matrix for this optimal model, showcasing the details of how well the model navigated and
classified instances with the applied swap data augmentation. This synthesis of experimentation
outcomes reinforces not only the efficacy of swap data augmentation but also the significance
of precise configurations in achieving the model’s peak performance.</p>
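      <p>A confusion matrix such as the one in Figure 1 simply tallies true labels against predicted labels; a toy sketch with hypothetical data:</p>

```python
def confusion_matrix(y_true, y_pred, n_classes=2):
    """Rows index the true label, columns the predicted label."""
    mat = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        mat[t][p] += 1
    return mat

# Toy labels (0 = Hate-Offensive, 1 = Normal), not the paper's results.
print(confusion_matrix([0, 0, 1, 1, 1], [0, 1, 1, 1, 0]))  # [[1, 1], [1, 2]]
```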
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion and Future Work</title>
      <p>Recognizing the pressing need to combat hate speech within the constraints of limited resources,
the contribution of our paper lies in the strategic application of data augmentation techniques to
linguistic datasets from social media. We assert that applying data augmentation techniques to a
dataset helps to increase its size and improves overall model performance. We experimented
to determine the ideal percentage of augmented data to integrate with the
original dataset. This exploration aimed not only to enhance model training efficiency but also
to circumvent the pitfalls of model overfitting. Our experimentation involved the application of
three distinct data augmentation techniques: random swap, random deletion, and spelling
augmentation. Notably, the results underscore the prowess of swap data augmentation, exhibiting
the highest accuracy at 91.3%. This achievement was realized with a 30% word augmentation
rate and a 50% augmented data incorporation. We also published the new RUHSOLD++ dataset
containing the augmented data. For future work we envision the exploration of additional
augmentation techniques, setting the stage for a comprehensive comparison of model performances.
This will improve and add more tools to the ongoing fight against hate speech on social media,
especially for under-resourced languages such as Roman Urdu.</p>
      <p>[8] J. Devlin, M.-W. Chang, K. Lee, K. Toutanova, BERT: Pre-training of deep bidirectional transformers for language understanding, in: J. Burstein, C. Doran, T. Solorio (Eds.), Proceedings of NAACL-HLT 2019, Minneapolis, MN, USA, June 2-7, 2019, Volume 1, Association for Computational Linguistics, 2019, pp. 4171-4186. URL: https://doi.org/10.18653/v1/n19-1423. doi:10.18653/V1/N19-1423.
[9] Z. Zhang, D. Robinson, J. A. Tepper, Detecting hate speech on twitter using a convolution-GRU based deep neural network, in: A. Gangemi, R. Navigli, M. Vidal, P. Hitzler, R. Troncy, L. Hollink, A. Tordai, M. Alam (Eds.), The Semantic Web - 15th International Conference, ESWC 2018, Heraklion, Crete, Greece, June 3-7, 2018, Proceedings, volume 10843 of Lecture Notes in Computer Science, Springer, 2018, pp. 745-760. URL: https://doi.org/10.1007/978-3-319-93417-4_48. doi:10.1007/978-3-319-93417-4_48.
[10] G. V. Houdt, C. Mosquera, G. Nápoles, A review on the long short-term memory model, Artif. Intell. Rev. 53 (2020) 5929-5955. URL: https://doi.org/10.1007/s10462-020-09838-1. doi:10.1007/S10462-020-09838-1.
[11] H. S. Alatawi, A. Alhothali, K. Moria, Detection of hate speech using BERT and hate speech word embedding with deep model, CoRR abs/2111.01515 (2021). URL: https://arxiv.org/abs/2111.01515. arXiv:2111.01515.
[12] M. Bilal, A. Khan, S. Jan, S. Musa, S. Ali, Roman urdu hate speech detection using transformer-based model for cyber security applications, Sensors 23 (2023) 3909. URL: https://doi.org/10.3390/s23083909. doi:10.3390/S23083909.
[13] M. Das, S. Banerjee, A. Mukherjee, Data bootstrapping approaches to improve low resource abusive language detection for indic languages, in: A. Bellogín, L. Boratto, F. Cena (Eds.), HT '22: 33rd ACM Conference on Hypertext and Social Media, Barcelona, Spain, 28 June 2022 - 1 July 2022, ACM, 2022, pp. 32-42. URL: https://doi.org/10.1145/3511095.3531277. doi:10.1145/3511095.3531277.
[14] S. Khanuja, D. Bansal, S. Mehtani, S. Khosla, A. Dey, B. Gopalan, D. K. Margam, P. Aggarwal, R. T. Nagipogu, S. Dave, S. Gupta, S. C. B. Gali, V. Subramanian, P. P. Talukdar, Muril: Multilingual representations for indian languages, CoRR abs/2103.10730 (2021). URL: https://arxiv.org/abs/2103.10730. arXiv:2103.10730.
[15] M. M. Khan, K. Shahzad, M. K. Malik, Hate speech detection in roman urdu, ACM Trans. Asian Low Resour. Lang. Inf. Process. 20 (2021) 9:1-9:19. URL: https://doi.org/10.1145/3414524. doi:10.1145/3414524.
[16] U. Azam, H. Rizwan, A. Karim, Exploring data augmentation strategies for hate speech detection in roman urdu, in: Proceedings of the Thirteenth Language Resources and Evaluation Conference, LREC 2022, Marseille, France, 20-25 June 2022, European Language Resources Association, 2022, pp. 4523-4531. URL: https://aclanthology.org/2022.lrec-1.481.
[17] J. Salazar, D. Liang, T. Q. Nguyen, K. Kirchhoff, Masked language model scoring, in: D. Jurafsky, J. Chai, N. Schluter, J. R. Tetreault (Eds.), Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, ACL 2020, Online, July 5-10, 2020, Association for Computational Linguistics, 2020, pp. 2699-2712. URL: https://doi.org/10.18653/v1/2020.acl-main.240. doi:10.18653/V1/2020.ACL-MAIN.240.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>P.</given-names>
            <surname>Fortuna</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Nunes</surname>
          </string-name>
          ,
          <article-title>A survey on automatic detection of hate speech in text</article-title>
          ,
          <source>ACM Comput. Surv</source>
          .
          <volume>51</volume>
          (
          <year>2018</year>
          )
          <volume>85</volume>
          :
          <fpage>1</fpage>
          -
          <lpage>85</lpage>
          :
          <fpage>30</fpage>
          . URL: https://doi.org/10.1145/3232676. doi:
          <volume>10</volume>
          .1145/3232676.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>M. M. Khan</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          <string-name>
            <surname>Shahzad</surname>
            ,
            <given-names>M. K.</given-names>
          </string-name>
          <string-name>
            <surname>Malik</surname>
          </string-name>
          ,
          <article-title>Hate speech detection in roman urdu</article-title>
          ,
          <source>ACM Trans. Asian Low Resour. Lang. Inf. Process</source>
          .
          <volume>20</volume>
          (
          <year>2021</year>
          ) 9:
          <fpage>1</fpage>
          -
          <lpage>9</lpage>
          :
          <fpage>19</fpage>
          . URL: https://doi.org/10.1145/ 3414524. doi:
          <volume>10</volume>
          .1145/3414524.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>H.</given-names>
            <surname>Rizwan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. H.</given-names>
            <surname>Shakeel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Karim</surname>
          </string-name>
          ,
          <article-title>Hate-speech and offensive language detection in roman urdu</article-title>
          , in: B.
          <string-name>
            <surname>Webber</surname>
            , T. Cohn,
            <given-names>Y.</given-names>
          </string-name>
          <string-name>
            <surname>He</surname>
          </string-name>
          , Y. Liu (Eds.),
          <source>Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, EMNLP</source>
          <year>2020</year>
          , Online,
          <source>November 16-20</source>
          ,
          <year>2020</year>
          , Association for Computational Linguistics,
          <year>2020</year>
          , pp.
          <fpage>2512</fpage>
          -
          <lpage>2522</lpage>
          . URL: https:// doi.org/10.18653/v1/
          <year>2020</year>
          .emnlp-main.
          <volume>197</volume>
          . doi:
          <volume>10</volume>
          .18653/V1/
          <year>2020</year>
          .EMNLP-MAIN.
          <year>197</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>A.</given-names>
            <surname>Dewani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Memon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bhatti</surname>
          </string-name>
          ,
          <article-title>Development of computational linguistic resources for automated detection of textual cyberbullying threats in Roman Urdu language</article-title>
          ,
          <source>3C TIC: Cuadernos de desarrollo aplicados a las TIC</source>
          <volume>10</volume>
          (
          <year>2021</year>
          )
          <fpage>101</fpage>
          -
          <lpage>121</lpage>
          . doi:10.17993/3ctic.2021.102.101-121.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <article-title>Detecting offensive language in social media to protect adolescent online safety</article-title>
          , in:
          <source>2012 International Conference on Privacy, Security, Risk and Trust (PASSAT 2012) and 2012 International Conference on Social Computing (SocialCom 2012), Amsterdam, Netherlands, September 3-5, 2012</source>
          , IEEE Computer Society,
          <year>2012</year>
          , pp.
          <fpage>71</fpage>
          -
          <lpage>80</lpage>
          . doi:10.1109/SocialCom-PASSAT.2012.55.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>P.</given-names>
            <surname>Badjatiya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Varma</surname>
          </string-name>
          ,
          <article-title>Deep learning for hate speech detection in tweets</article-title>
          , in: R. Barrett, R. Cummings, E. Agichtein, E. Gabrilovich (Eds.),
          <source>Proceedings of the 26th International Conference on World Wide Web Companion, Perth, Australia, April 3-7, 2017</source>
          , ACM,
          <year>2017</year>
          , pp.
          <fpage>759</fpage>
          -
          <lpage>760</lpage>
          . doi:10.1145/3041021.3054223.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>M.</given-names>
            <surname>Das</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Saha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Dutt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Goyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mukherjee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Mathew</surname>
          </string-name>
          ,
          <article-title>You too brutus! Trapping hateful users in social media: Challenges, solutions &amp; insights</article-title>
          , in: O. Conlan, E. Herder (Eds.),
          <source>HT '21: 32nd ACM Conference on Hypertext and Social Media, Virtual Event, Ireland, 30 August 2021 - 2 September 2021</source>
          , ACM,
          <year>2021</year>
          , pp.
          <fpage>79</fpage>
          -
          <lpage>89</lpage>
          . doi:10.1145/3465336.3475106.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>J.</given-names>
            <surname>Devlin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Toutanova</surname>
          </string-name>
          ,
          <article-title>BERT: pre-training of deep bidirectional transformers for language understanding</article-title>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>