<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Machine learning for real-time hate speech detection in social media</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Aigerim</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tolep</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Altayeva</string-name>
          <email>a.altayeva@iitu.edu.kz</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Aigerim Toktarova</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Rustam</string-name>
          <email>abdrakhmanov.rustam@iuth.edu.kz</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Abdrakhmanov</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>International Information Technology University</institution>
          ,
          <addr-line>34/1 Manas St., Almaty, 050000</addr-line>
          ,
          <country country="KZ">Kazakhstan</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>International University of Tourism and Hospitality</institution>
          ,
          <addr-line>Turkistan</addr-line>
          ,
          <country country="KZ">Kazakhstan</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Khoja Akhmet Yassawi International Kazakh-Turkish University</institution>
          ,
          <addr-line>Turkistan</addr-line>
          ,
          <country country="KZ">Kazakhstan</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The pervasive spread of hate speech on social media platforms calls for effective detection mechanisms to maintain online civility and safety. This paper investigates the application of machine learning algorithms to hate speech identification, using feature sets that include statistical features, Term Frequency-Inverse Document Frequency (TFIDF), and Linguistic Inquiry and Word Count (LIWC). Through a comparative analysis, the study evaluates six machine learning models (Random Forest, Decision Tree, K-Nearest Neighbors (KNN), Naive Bayes, Logistic Regression, and Support Vector Machines) in terms of accuracy, precision, recall, F-score, and area under the ROC curve (AUC-ROC). The results indicate that models combining linguistic and statistical features outperform those using simpler feature sets, highlighting the role of comprehensive feature engineering in the detection process. The study also addresses the ethical implications of automated hate speech detection, emphasizing the need for balanced approaches that consider both the effectiveness of content moderation and the protection of free speech. This research contributes to the field by outlining the strengths and limitations of current methodologies and suggesting pathways for future improvement, including the integration of more sophisticated natural language processing techniques and the continual refinement of ethical standards in model deployment.</p>
      </abstract>
      <kwd-group>
        <kwd>Hate speech detection</kwd>
        <kwd>machine learning</kwd>
        <kwd>social media</kwd>
        <kwd>NLP</kwd>
        <kwd>text processing</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>In the digital age, social media platforms have become central to our daily communication,
fostering interactions across global communities. However, this connectivity also brings
challenges, notably the proliferation of hate speech, which can incite violence, spread discord, and
cause psychological harm. The rise in online hate speech has prompted urgent calls for effective
monitoring and intervention mechanisms. Machine learning (ML), with its ability to analyze large
volumes of data, offers promising solutions for identifying and mitigating hate speech in real time.</p>
      <p>
        One of the key challenges in developing ML models for hate speech detection is the creation of
robust, diverse datasets that accurately represent the scope of hateful content without bias.
Previous research has highlighted the importance of comprehensive datasets that are annotated
with high accuracy, as the quality of data directly impacts the effectiveness of the detection models
[
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Furthermore, ensuring that these datasets are representative of different languages, dialects,
and cultural contexts is crucial for the global applicability of the models.
      </p>
      <p>
        Another significant aspect is the ethical considerations surrounding automated monitoring
systems. There is an ongoing debate regarding the balance between freedom of expression and the
need to protect individuals from hate speech. Scholars advocate for transparent, accountable
algorithms to prevent unjust censorship and maintain user trust [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Additionally, the deployment
of ML models in real-time scenarios raises concerns about privacy and data security, necessitating
strict compliance with data protection regulations [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
      </p>
      <p>
        Advancements in deep learning have led to the development of more accurate and efficient
models for text analysis. Neural networks, particularly those utilizing transformer architectures,
have shown great promise in understanding the context and complexity of language used in online
platforms [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. These models, trained on extensive web-crawled data, can detect subtle cues and
variations in text, making them effective for real-time applications in diverse settings.
      </p>
      <p>
        The integration of machine learning in combating hate speech on social media not only
enhances the ability to monitor large volumes of content but also supports moderators in making
informed decisions. By automating the detection process, platforms can respond more swiftly and
consistently to hate speech incidents, potentially reducing the spread and impact of harmful
content [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
      </p>
      <p>
        The application of machine learning techniques in detecting hate speech is a dynamic and
evolving field that addresses both technical and ethical challenges. As this technology advances,
continuous evaluation and adaptation of these models are essential to ensure they remain effective
across different social media environments and meet the ethical standards required for widespread
deployment [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Related works</title>
      <p>The proliferation of hate speech on social media has been met with various machine learning
strategies aimed at its detection and mitigation. Several works have paved the way in addressing
the technical and ethical challenges involved in this area. These efforts span from the development
of algorithms and models to the creation of datasets and ethical frameworks.</p>
      <p>
        Early attempts at hate speech detection primarily utilized classical machine learning
techniques, such as Support Vector Machines (SVM) and Naive Bayes classifiers. These methods
were often coupled with bag-of-words models to classify text as hate speech or non-hate speech
[
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. While effective to a degree, these approaches lacked the ability to understand context and the
subtleties of language, which are crucial in accurately identifying hate speech.
      </p>
      <p>
        Recent advancements have shifted focus towards deep learning techniques, which offer superior
performance in text analysis due to their ability to capture hierarchical representations of data.
Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) have been widely
adopted for their proficiency in processing sequential data, making them particularly suited for
handling the complexities of natural language [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. For instance, Long Short-Term Memory (LSTM)
networks, a variant of RNNs, have been used extensively due to their capability to remember
long-term dependencies in text [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ].
      </p>
      <p>
        In summary, the body of work on machine learning for hate speech detection is extensive and
multi-faceted. It spans from technical advancements in model architecture and dataset development
to ethical considerations and real-world applicability [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ]. As social media continues to evolve, so
too must the strategies employed to combat hate speech, ensuring they are robust, ethical, and
adaptable to new challenges [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ]. The ongoing research is crucial in shaping the future of safe and
inclusive online environments [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ].
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Materials and methods</title>
      <p>This section outlines the methodological framework employed in the development of a machine
learning pipeline for real-time detection of hate speech on social media platforms. The process is
visualized in Figure 1, which presents a structured flowchart delineating each step from data
collection to the application of the classifier models. The systematic approach ensures a robust
analysis of textual data, leveraging advanced machine learning techniques to identify and classify
hate speech effectively.</p>
      <p>The prototype database for the system was built from an analysis of 215
English-language Twitter accounts comprising a total of 200,000 tweets, of which more than 4,000
were examined in detail. The analysis identified 583 English-language tweets displaying
traits of the harmful strategy termed “cyberbullying.” Electronic verbal bullying was primarily
noted in posts by adolescents aged 11-17 and young adults aged 18-35. Adolescent cyberbullying
generally involved groups, whereas electronic bullying among teenagers adhered to a “one bully –
one victim” paradigm.</p>
      <sec id="sec-3-1">
        <title>3.1. Problem statement</title>
        <p>The primary objective of this research is to develop a machine learning-based system capable of
accurately detecting hate speech in real-time on social media platforms. Hate speech, for the
purpose of this study, is defined as any communication that disparages a person or a group on the
basis of some characteristic such as race, color, ethnicity, gender, sexual orientation, nationality,
religion, or other characteristics.</p>
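        <p>Given a dataset <inline-formula><tex-math>D</tex-math></inline-formula> containing textual entries <inline-formula><tex-math>x_i</tex-math></inline-formula>, each labeled as hate speech (<inline-formula><tex-math>y_i = 1</tex-math></inline-formula>) or not hate
speech (<inline-formula><tex-math>y_i = 0</tex-math></inline-formula>), the goal is to train a classifier <inline-formula><tex-math>f</tex-math></inline-formula> that predicts the label <inline-formula><tex-math>\hat{y}_i</tex-math></inline-formula> of a new, unseen text <inline-formula><tex-math>x_i</tex-math></inline-formula>
based on patterns learned from <inline-formula><tex-math>D</tex-math></inline-formula>:</p>
        <p>
          <disp-formula id="eq-1">
            <label>(1)</label>
            <tex-math>f(x_i) = \hat{y}_i</tex-math>
          </disp-formula>
        </p>
        <p>where <inline-formula><tex-math>x_i</tex-math></inline-formula> is the feature vector extracted from the text, <inline-formula><tex-math>y_i</tex-math></inline-formula> is the true label, and <inline-formula><tex-math>\hat{y}_i \in \{0, 1\}</tex-math></inline-formula> is the predicted label.</p>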
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Proposed method</title>
        <p>Figure 1 illustrates the comprehensive workflow employed in the machine learning pipeline for
hate speech detection on social media platforms. The process begins with the collection of an
annotated hate speech dataset from various platforms. This dataset consists of textual data that has
been manually labeled as 'hate' or 'not hate,' providing a foundation for training and validating the
machine learning models. The diversity of the platforms ensures that the dataset encompasses a
wide range of linguistic expressions and contexts, thereby enhancing the robustness of the
detection system. Following the dataset compilation, the data undergoes a series of preprocessing
steps. These steps are crucial for cleaning and normalizing the data, which include removing noise
such as irrelevant symbols, correcting typos, and standardizing text format. This preprocessing
phase is essential to reduce the complexity of the text and to enhance the performance of the
subsequent machine learning algorithms by focusing on the relevant features of the data.</p>
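        <p>The cleaning and normalization steps described above can be sketched as follows. This is a minimal illustration, not the preprocessing code used in the study: the function name preprocess and the specific rules (lowercasing, stripping URLs, user mentions, and non-alphanumeric symbols) are assumptions, and typo correction is omitted.</p>
        <preformat><![CDATA[
```python
import re

# Illustrative patterns; the actual cleaning rules of the study are not specified.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
NON_ALNUM_RE = re.compile(r"[^a-z0-9\s']")
SPACE_RE = re.compile(r"\s+")


def preprocess(text):
    """Lowercase a post and strip URLs, user mentions, and stray symbols."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = text.replace("#", " ")       # keep the hashtag word, drop the marker
    text = NON_ALNUM_RE.sub(" ", text)  # remove punctuation and emoticons
    return SPACE_RE.sub(" ", text).strip()


print(preprocess("@user Check THIS out!!! https://t.co/abc #NoHate :)"))
# prints: check this out nohate
```
]]></preformat>
        <p>Keeping apostrophes preserves contractions such as "don't" as single tokens; whether that helps or hurts would need to be checked on the actual data.</p>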
        <p>The preprocessed data is then converted into a numerical format through a "word to vector"
conversion process. This transformation is pivotal as it turns the raw text into a structured form
that machine learning algorithms can interpret. Feature extraction follows, where significant
attributes or features from the text are identified and extracted. These features could include word
frequency, presence of specific terms, and other linguistic markers indicative of hate speech. The
data is then split into training and validation datasets, which are used to train the models and tune
their parameters, respectively. The trained models, including Support Vector Machine (SVM),
Naive Bayes (NB), Decision Tree (DT), and Random Forest (RF), are finally applied to a separate
testing dataset to evaluate their effectiveness in classifying and predicting hate speech accurately.
The output categorizes the text into 'hate' or 'not hate,' providing a tool for automated moderation
on social media platforms.</p>
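        <p>As an illustration of the text-to-vector step, the sketch below computes TFIDF vectors from tokenized documents using only the Python standard library. The function name tfidf_vectors, the toy documents, and the smoothed inverse document frequency are assumptions chosen for clarity; the study does not state which TFIDF variant or library was used.</p>
        <preformat><![CDATA[
```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, vectors): one TFIDF vector per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vocab = sorted(df)
    # Smoothed IDF (one common variant, assumed here).
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1.0 for t in vocab}
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1  # guard against empty documents
        vectors.append([counts[t] / total * idf[t] for t in vocab])
    return vocab, vectors


# Toy, hypothetical inputs (already preprocessed and whitespace-tokenized).
docs = [s.split() for s in ["you are awful", "you are kind", "have a kind day"]]
vocab, X = tfidf_vectors(docs)
```
]]></preformat>
        <p>In this toy example the term "awful" occurs in a single document and therefore receives a higher weight than "you", which occurs in two. The resulting vectors could then be split into training, validation, and test sets and passed to any of the classifiers named above.</p>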
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Experiment results</title>
      <p>Figure 2 provides a comprehensive comparison of various machine learning algorithms in terms of
their performance metrics on the task of hate speech detection. The algorithms tested include
Logistic Regression, Naive Bayes, Support Vector Machine (SVM), Decision Trees, K-Nearest
Neighbors (KNN), and Random Forests, across five key metrics: accuracy, precision, recall,
F1-score, and ROC AUC score. The results are presented in a bar graph format, allowing for a clear
visual comparison of each algorithm's effectiveness in identifying and classifying hate speech.</p>
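      <p>For reference, the threshold-dependent metrics named above can be computed from the counts of true and false positives and negatives. The following is a minimal sketch under the assumption of binary labels (1 for hate speech, 0 otherwise); the function name classification_metrics and the example labels are hypothetical and are not results from the study.</p>
      <preformat><![CDATA[
```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for binary labels (1 = hate, 0 = not hate)."""
    pairs = list(zip(y_true, y_pred))
    if not pairs:
        raise ValueError("at least one labeled example is required")
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    # Fall back to 0.0 when a denominator would be zero.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical labels and predictions, for illustration only.
m = classification_metrics([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 1, 0])
```
]]></preformat>
      <p>With these example inputs there are two true positives, two true negatives, one false positive, and one false negative, so accuracy, precision, recall, and F1 all equal 2/3.</p>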
      <p>The performance of each algorithm varies significantly, highlighting their strengths and
weaknesses in different aspects of hate speech detection. Logistic Regression, SVM, and Random
Forest show strong performance across all metrics, suggesting their robustness and suitability for
this application. In contrast, algorithms like Naive Bayes and Decision Trees display lower
performance in certain metrics, indicating potential limitations in their ability to handle the
complex and nuanced nature of hate speech text data. The ROC AUC scores are particularly
important as they provide insight into the models' ability to discriminate between the classes under
varying threshold settings, essential for tuning the models in practical applications where the cost
of false positives and false negatives can vary.</p>
      <p>The AUC values measure each model's ability to distinguish between the two classes
(hate speech and not hate speech); a higher AUC indicates better discrimination. In
the figure, SVM shows the highest AUC at 0.87, indicating a strong capability to separate
the classes. Random Forest and Logistic Regression also perform well, with AUCs of 0.83 and
0.81 respectively, followed by KNN and Decision Tree, both at 0.79, which indicates
moderate discriminative power. In contrast, Naive Bayes exhibits markedly lower performance
with an AUC of 0.51, close to the 0.5 expected from random guessing, suggesting it struggles to
differentiate between hate speech and non-hate speech within the tested dataset. This
comparison highlights the varying effectiveness of each algorithm in handling the nuances of hate
speech detection and can guide the selection of the most appropriate model based on the specific
requirements and constraints of the application.</p>
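      <p>One way to read an AUC value is as the probability that a randomly chosen hate speech example receives a higher score than a randomly chosen non-hate example. The sketch below computes AUC directly from that definition. The function name roc_auc and the example scores are assumptions for illustration, and the pairwise loop is meant for small inputs rather than production use.</p>
      <preformat><![CDATA[
```python
def roc_auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical scores: three of the four positive/negative pairs are ordered correctly.
print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1]))  # prints: 0.75
```
]]></preformat>
      <p>A classifier that assigns the same score to every input obtains 0.5 under this definition, which is why an AUC near 0.5 indicates performance no better than chance.</p>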
      <p>The comparative analysis presented in Table 1 elucidates the performance of various machine
learning models in the context of hate speech detection, utilizing different feature sets such as
statistical features, TFIDF (Term Frequency-Inverse Document Frequency), and LIWC (Linguistic
Inquiry and Word Count). Decision Tree and K-Nearest Neighbors (KNN) models, which
incorporate both statistical and TFIDF features, along with LIWC for KNN, demonstrate the highest
overall performance among the evaluated models. Specifically, KNN achieves a marginally better
balance across all metrics, with accuracy, precision, and recall of approximately 0.5989, 0.5991, and 0.5981,
respectively, and similar F-score and AUC-ROC values. This suggests that incorporating a
broader range of linguistic and statistical features can enhance the model's ability to detect hate
speech effectively. In contrast, models utilizing only statistical features, such as Random Forest and
Naïve Bayes, show comparatively lower performance metrics, highlighting the significance of
feature selection in improving the detection capabilities of machine learning algorithms in complex
tasks like hate speech detection. The results underscore the importance of tailored feature
engineering and the potential impact of integrating comprehensive linguistic analyses to refine the
precision and reliability of hate speech classification systems.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Discussion</title>
      <p>The findings from this study contribute to the growing body of research on the application of
machine learning techniques in detecting hate speech on social media platforms. The results
underscore the importance of selecting appropriate feature sets and machine learning models to
enhance detection accuracy and efficiency. Particularly, the superior performance of models
incorporating advanced feature sets such as TFIDF and LIWC suggests the critical role of
sophisticated linguistic analysis in understanding and identifying hate speech effectively.</p>
      <p>
        The Decision Tree and KNN models, which included a combination of statistical, TFIDF, and
LIWC features, outperformed other models, indicating that the integration of comprehensive
linguistic and statistical indicators can significantly improve the model’s ability to classify and
predict hate speech. This finding aligns with previous studies which emphasized that the quality
and diversity of features are decisive factors in the performance of machine learning algorithms in
text classification tasks [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ]. The results also highlight the limitations of using simplistic feature
sets, as seen in the relatively lower performance metrics of the Naïve Bayes and Logistic
Regression models that relied solely on statistical features.
      </p>
      <p>
        Furthermore, ensemble methods such as Random Forest did not achieve the highest
performance despite their known robustness in various classification tasks. This may be attributed
to the complex and nuanced nature of language used in hate speech, which requires not only
statistical generalization but also a deeper semantic understanding that ensemble methods might not
capture effectively [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ]. This observation is crucial for future research, which should explore
deeper linguistic and contextual analysis methods, potentially through the integration of natural
language processing (NLP) techniques like sentiment analysis and context-aware processing [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ].
      </p>
      <p>
        The ethical considerations of employing machine learning for hate speech detection also
warrant discussion. The trade-offs between effectively moderating content and preserving freedom
of speech are complex and multifaceted. While machine learning offers significant advantages in
automating the detection of hate speech, it also poses risks such as biases in the training data
leading to unfair censorship [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ]. Studies have highlighted the necessity for transparent and
accountable machine learning models to mitigate these risks [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ]. Moreover, ongoing monitoring
and updating of models are essential to adapt to the evolving nature of language and hate speech
tactics [
        <xref ref-type="bibr" rid="ref27">27-28</xref>
        ].
      </p>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion</title>
      <p>This research has systematically explored the application of various machine learning algorithms
for the detection of hate speech on social media, revealing critical insights into the performance
and limitations of these models. By integrating diverse feature sets, including statistical, TFIDF, and
LIWC, the study demonstrated that models like Decision Tree and KNN, which employ a
combination of linguistic and statistical features, significantly outperform those relying solely on
basic features. This underscores the importance of sophisticated feature engineering in enhancing
the detection capabilities of algorithms in the nuanced realm of hate speech. Moreover, the findings
emphasize the necessity of ongoing model refinement and the integration of advanced natural
language processing techniques to better capture the context and complexity of language used in
hate speech. Ethical considerations also emerged as a pivotal aspect of deploying machine learning
solutions, highlighting the delicate balance between effective moderation and the preservation of
freedom of speech. The challenges of bias and fairness in algorithmic decisions call for transparent,
accountable practices in machine learning deployments. Future research should thus not only focus
on improving the technical accuracy of detection models but also on developing ethical
frameworks that govern their application, ensuring that they remain adaptable and sensitive to the
dynamic landscape of social media communication. This study contributes to the broader discourse
on leveraging technology to create safer online environments, while also respecting user rights and
fostering positive digital interactions.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgements</title>
      <p>This work was supported by the research project "Automatic detection of cyberbullying among
young people in social networks using artificial intelligence", funded by the Ministry of Science and
Higher Education of the Republic of Kazakhstan, Grant No. IRN AP23488900.</p>
    </sec>
    <sec id="sec-8">
      <title>Declaration on Generative AI</title>
      <p>The authors have not employed any Generative AI tools.</p>
      <p>[28] Batani, J., Mbunge, E., Muchemwa, B., Gaobotse, G., Gurajena, C., Fashoto, S., ... &amp; Dandajena,
K. (2022, April). A review of deep learning models for detecting cyberbullying on social media
networks. In Computer Science On-line Conference (pp. 528-550). Cham: Springer
International Publishing.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>Miran</surname>
            ,
            <given-names>A. Z.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Yahia</surname>
            ,
            <given-names>H. S.</given-names>
          </string-name>
          (
          <year>2023</year>
          ).
          <article-title>Hate Speech Detection in Social Media (Twitter) Using Neural Network</article-title>
          .
          <source>J. Mobile Multimedia</source>
          ,
          <volume>19</volume>
          (
          <issue>3</issue>
          ),
          <fpage>765</fpage>
          -
          <lpage>798</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>Gudumotu</surname>
            ,
            <given-names>C. E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nukala</surname>
            ,
            <given-names>S. R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Reddy</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Konduri</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Gireesh</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          (
          <year>2023</year>
          ).
          <article-title>A Survey on Deep Learning Models to Detect Hate Speech and Bullying in Social Media</article-title>
          .
          In
          <source>Artificial Intelligence for Societal Issues</source>
          (pp.
          <fpage>27</fpage>
          -
          <lpage>44</lpage>
          ). Cham: Springer International Publishing.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Sai</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Srivastava</surname>
            ,
            <given-names>N. D.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Sharma</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          (
          <year>2022</year>
          ).
          <article-title>Explorative application of fusion techniques for multimodal hate speech detection</article-title>
          .
          <source>SN Computer Science</source>
          ,
          <volume>3</volume>
          (
          <issue>2</issue>
          ),
          <fpage>122</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Sultan</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Toktarova</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhumadillayeva</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Aldeshov</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mussiraliyeva</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Beissenova</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          , ... &amp;
          <string-name>
            <surname>Imanbayeva</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          (
          <year>2023</year>
          ).
          <article-title>Cyberbullying-related hate speech detection using shallow-to-deep learning</article-title>
          .
          <source>Computers, Materials &amp; Continua</source>
          ,
          <volume>74</volume>
          (
          <issue>1</issue>
          ),
          <fpage>2115</fpage>
          -
          <lpage>2131</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <surname>Simon</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Baha</surname>
            ,
            <given-names>B. Y.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Garba</surname>
            ,
            <given-names>E. J.</given-names>
          </string-name>
          (
          <year>2022</year>
          ).
          <article-title>Trends in machine learning on automatic detection of hate speech on social media platforms: A systematic review</article-title>
          .
          <source>FUW Trends in Science &amp; Technology Journal</source>
          ,
          <volume>7</volume>
          (
          <issue>1</issue>
          ),
          <fpage>001</fpage>
          -
          <lpage>016</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <surname>Toktarova</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sultan</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Azhibekova</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          (
          <year>2024</year>
          , May).
          <article-title>Review of Machine Learning Models in Cyberbullying Detection Problem</article-title>
          . In
          <source>2024 IEEE 4th International Conference on Smart Information Systems and Technologies (SIST)</source>
          (pp.
          <fpage>233</fpage>
          -
          <lpage>238</lpage>
          ). IEEE.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <surname>del Valle-Cano</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Quijano-Sánchez</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liberatore</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Gómez</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          (
          <year>2023</year>
          ).
          <article-title>SocialHaterBERT: A dichotomous approach for automatically detecting hate speech on Twitter through textual analysis and user profiles</article-title>
          .
          <source>Expert Systems with Applications</source>
          ,
          <volume>216</volume>
          ,
          <fpage>119446</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <surname>Al-onazi</surname>
            ,
            <given-names>B. B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Alzahrani</surname>
            ,
            <given-names>J. S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Alotaibi</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Alshahrani</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Elfaki</surname>
            ,
            <given-names>M. A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Marzouk</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          , ... &amp;
          <string-name>
            <surname>Motwakel</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          (
          <year>2024</year>
          ).
          <article-title>Chaotic Elephant Herd Optimization with Machine Learning for Arabic Hate Speech Detection</article-title>
          .
          <source>Intelligent Automation &amp; Soft Computing</source>
          ,
          <volume>39</volume>
          (
          <issue>3</issue>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <surname>Mazari</surname>
            ,
            <given-names>A. C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Boudoukhani</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Djeffal</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          (
          <year>2024</year>
          ).
          <article-title>BERT-based ensemble learning for multiaspect hate speech detection</article-title>
          .
          <source>Cluster Computing</source>
          ,
          <volume>27</volume>
          (
          <issue>1</issue>
          ),
          <fpage>325</fpage>
          -
          <lpage>339</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <surname>Makhanova</surname>
            ,
            <given-names>Zlikha</given-names>
          </string-name>
          , et al. (
          <year>2024</year>
          ).
          <article-title>A Deep Residual Network Designed for Detecting Cracks in Buildings of Historical Significance</article-title>
          .
          <source>International Journal of Advanced Computer Science &amp; Applications</source>
          ,
          <volume>15</volume>
          (
          <issue>5</issue>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <surname>Mohamed</surname>
            ,
            <given-names>M. S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Elzayady</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Badran</surname>
            ,
            <given-names>K. M.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Salama</surname>
            ,
            <given-names>G. I.</given-names>
          </string-name>
          (
          <year>2023</year>
          ).
          <article-title>An efficient approach for data-imbalanced hate speech detection in Arabic social media</article-title>
          .
          <source>Journal of Intelligent &amp; Fuzzy Systems</source>
          ,
          <volume>45</volume>
          (
          <issue>4</issue>
          ),
          <fpage>6381</fpage>
          -
          <lpage>6390</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <surname>Khullar</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nkemelu</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nguyen</surname>
            ,
            <given-names>V. C.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Best</surname>
            ,
            <given-names>M. L.</given-names>
          </string-name>
          (
          <year>2024</year>
          ).
          <article-title>Hate Speech Detection in Limited Data Contexts using Synthetic Data Generation</article-title>
          .
          <source>ACM Journal on Computing and Sustainable Societies</source>
          ,
          <volume>2</volume>
          (
          <issue>1</issue>
          ),
          <fpage>1</fpage>
          -
          <lpage>18</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <surname>Paul</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Bora</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          (
          <year>2021</year>
          ).
          <article-title>Detecting hate speech using deep learning techniques</article-title>
          .
          <source>International Journal of Advanced Computer Science and Applications</source>
          ,
          <volume>12</volume>
          (
          <issue>2</issue>
          ),
          <fpage>619</fpage>
          -
          <lpage>623</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <surname>Akhter</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Acharjee</surname>
            ,
            <given-names>U. K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Talukder</surname>
            ,
            <given-names>M. A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Islam</surname>
            ,
            <given-names>M. M.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Uddin</surname>
            ,
            <given-names>M. A.</given-names>
          </string-name>
          (
          <year>2023</year>
          ).
          <article-title>A robust hybrid machine learning model for Bengali cyber bullying detection in social media</article-title>
          .
          <source>Natural Language Processing Journal</source>
          ,
          <volume>4</volume>
          ,
          <fpage>100027</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <surname>Gandhi</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ahir</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Adhvaryu</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shah</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lohiya</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cambria</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          , ... &amp;
          <string-name>
            <surname>Hussain</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          (
          <year>2024</year>
          ).
          <article-title>Hate speech detection: A comprehensive review of recent works</article-title>
          .
          <source>Expert Systems</source>
          ,
          <fpage>e13562</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <surname>Plaza-del-Arco</surname>
            ,
            <given-names>F. M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Molina-González</surname>
            ,
            <given-names>M. D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ureña-López</surname>
            ,
            <given-names>L. A.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Martín-Valdivia</surname>
            ,
            <given-names>M. T.</given-names>
          </string-name>
          (
          <year>2021</year>
          ).
          <article-title>Comparing pre-trained language models for Spanish hate speech detection</article-title>
          .
          <source>Expert Systems with Applications</source>
          ,
          <volume>166</volume>
          ,
          <fpage>114120</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <surname>Musleh</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rahman</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Alkherallah</surname>
            ,
            <given-names>M. A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Al-Bohassan</surname>
            ,
            <given-names>M. K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Alawami</surname>
            ,
            <given-names>M. M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Alsebaa</surname>
            ,
            <given-names>H. A.</given-names>
          </string-name>
          , ... &amp;
          <string-name>
            <surname>Alhaidari</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          (
          <year>2024</year>
          ).
          <article-title>A Machine Learning Approach to Cyberbullying Detection in Arabic Tweets</article-title>
          .
          <source>Computers, Materials &amp; Continua</source>
          ,
          <volume>80</volume>
          (
          <issue>1</issue>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <surname>Sasikumar</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nambiar</surname>
            ,
            <given-names>R. K.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Rohith</surname>
            ,
            <given-names>K. P.</given-names>
          </string-name>
          (
          <year>2023</year>
          , July).
          <article-title>Unmasking Cyberbullies on Social Media Platforms Using Machine Learning</article-title>
          .
          In
          <source>2023 14th International Conference on Computing Communication and Networking Technologies (ICCCNT)</source>
          (pp.
          <fpage>1</fpage>
          -
          <lpage>7</lpage>
          ). IEEE.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <surname>Nitya Harshitha</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Prabu</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Suganya</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sountharrajan</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bavirisetti</surname>
            ,
            <given-names>D. P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gadde</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Uppu</surname>
            ,
            <given-names>L. S.</given-names>
          </string-name>
          (
          <year>2024</year>
          ).
          <article-title>ProTect: a hybrid deep learning model for proactive detection of cyberbullying on social media</article-title>
          .
          <source>Frontiers in Artificial Intelligence</source>
          ,
          <volume>7</volume>
          ,
          <fpage>1269366</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <surname>Paul</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Das Chatterjee</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Misra</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Majumder</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rana</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gain</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , ... &amp;
          <string-name>
            <surname>Sil</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          (
          <year>2024</year>
          ).
          <article-title>A survey and comparative study on negative sentiment analysis in social media data</article-title>
          .
          <source>Multimedia Tools and Applications</source>
          ,
          <fpage>1</fpage>
          -
          <lpage>50</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <surname>Maity</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Poornash</surname>
            ,
            <given-names>A. S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bhattacharya</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Phosit</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kongsamlit</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Saha</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Pasupa</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          (
          <year>2024</year>
          ).
          <article-title>HateThaiSent: Sentiment-Aided Hate Speech Detection in Thai Language</article-title>
          .
          <source>IEEE Transactions on Computational Social Systems</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <surname>Al-Hassan</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Al-Dossari</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          (
          <year>2022</year>
          ).
          <article-title>Detection of hate speech in Arabic tweets using deep learning</article-title>
          .
          <source>Multimedia Systems</source>
          ,
          <volume>28</volume>
          (
          <issue>6</issue>
          ),
          <fpage>1963</fpage>
          -
          <lpage>1974</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <surname>Kumar</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Bhat</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          (
          <year>2022</year>
          ).
          <article-title>A study of machine learning-based models for detection, control, and mitigation of cyberbullying in online social media</article-title>
          .
          <source>International Journal of Information Security</source>
          ,
          <volume>21</volume>
          (
          <issue>6</issue>
          ),
          <fpage>1409</fpage>
          -
          <lpage>1431</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <surname>Khanduja</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kumar</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Chauhan</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          (
          <year>2024</year>
          ).
          <article-title>Telugu Language Hate Speech Detection using Deep Learning Transformer Models: Corpus Generation and Evaluation</article-title>
          .
          <source>Systems and Soft Computing</source>
          ,
          <fpage>200112</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <surname>Kavitha</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Anchitaalagammai</surname>
            ,
            <given-names>J. V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Murali</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Deepalakshmi</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Himal</surname>
            ,
            <given-names>L. R.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Suryakanth</surname>
            ,
            <given-names>M. S.</given-names>
          </string-name>
          (
          <year>2023</year>
          , December).
          <article-title>Smart Language Checker: A Machine Learning Solution for Offensive Language detection in Social Media</article-title>
          .
          In
          <source>2023 International Conference on Data Science, Agents &amp; Artificial Intelligence (ICDSAAI)</source>
          (pp.
          <fpage>1</fpage>
          -
          <lpage>6</lpage>
          ). IEEE.
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <surname>Najafi</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Varol</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          (
          <year>2024</year>
          ).
          <article-title>TurkishBERTweet: Fast and reliable large language model for social media analysis</article-title>
          .
          <source>Expert Systems with Applications</source>
          ,
          <volume>255</volume>
          ,
          <fpage>124737</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <surname>Hermida</surname>
            ,
            <given-names>P. C. D. Q.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Santos</surname>
            ,
            <given-names>E. M. D.</given-names>
          </string-name>
          (
          <year>2023</year>
          ).
          <article-title>Detecting hate speech in memes: a review</article-title>
          .
          <source>Artificial Intelligence Review</source>
          ,
          <volume>56</volume>
          (
          <issue>11</issue>
          ),
          <fpage>12833</fpage>
          -
          <lpage>12851</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>