
                         Enhancing Hope Speech Detection on Twitter Using
                         Machine Learning and Transformer Models
                         Lemlem Eyob1,*, Tsadkan Yitbarek2, Amna Naseeb1, Grigori Sidorov1 and Ildar Batyrshin1
                         1 Instituto Politécnico Nacional (IPN), Centro de Investigación en Computación (CIC), Mexico City, Mexico
                         2 Maharishi International University, Fairfield, Iowa


                                         Abstract
                                         Hope is a positive mood rooted in the expectation of favorable results in one’s life or the world in general, and
                                         it can be directed at both the present and the future. We apply traditional machine learning models, namely
                                         Support Vector Machine (SVM) and Random Forest (RF), and a transformer-based BERT model to binary hope
                                         speech detection on an English dataset collected from Twitter and provided by the organizers of the HOPE
                                         shared task at IberLEF 2024. In our experiments, the BERT model achieved a macro-average F1-score of 0.85
                                         in the binary classification task and consistently outperformed the traditional machine learning models.
                                         This study provides insights into hope speech detection and explores the effectiveness of advanced NLP
                                         techniques in promoting positive communication online.

                                         Keywords
                                         Hope, Not Hope, BERT model, Machine learning


                         1. Introduction
                         Hope speech detection is the task of identifying inspirational talks, comments, and posts that convey
                         positivity [1]. Today, online social media platforms greatly affect human life, and people can freely
                         express their thoughts on these networks [2, 3].
                            Many studies have sought to curb the spread of negativity by removing vulgar and offensive
                         comments [4], hate speech [5], and threatening comments from social media. Far fewer studies
                         concentrate on positivity, that is, on promoting supportive and reassuring content in online
                         forums [6].
                            NLP researchers are deeply involved in exploring a wide array of linguistic areas due to the exponential
                         growth of online data. Their investigations encompass sentiment analysis [7], hate speech detection,
                         language identification [8], fake news identification [9], recognition of positive emotions [10], and
                         more.
                            These efforts are directed at deciphering human expression, understanding text sentiment, recognizing
                         offensive language, determining language origins, distinguishing between authentic and deceptive
                         content, and spotting instances of positivity. Additionally, researchers delve into the critical skill of
                         paraphrasing, essential for tasks like summarizing text and translating between languages. In summary,
                         NLP researchers dedicate themselves to unraveling the complexities of language, driving advancements
                         that facilitate improved communication and understanding across various fields and applications.
                            The main purpose of hope speech is to motivate people who are depressed, lonely, or stressed by
                         offering promise, assurance, suggestions, and help. The analysis of hope in social media is therefore a
                         valuable tool: it can reveal the direction of the goal-directed behaviors that are vital for well-being
                         and provide new insights into them [1]. Hence, there is a need to identify hope speech in social
                         media content [6].

                          IberLEF 2024, September 2024, Valladolid, Spain
                          * Corresponding author.
                          Email: lkawo2023@cic.ipn.mx (L. Eyob); tabebe@miu.edu (T. Yitbarek); nasseba23@cic.ipn.mx (A. Naseeb);
                          sidorov@cic.ipn.mx (G. Sidorov); batyr1@cic.ipn.mx (I. Batyrshin)
                                      © 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).


                          CEUR Workshop Proceedings (ceur-ws.org), ISSN 1613-0073


2. Related Work
Nowadays, the digital world has fundamentally changed the way people network and socialize [11].
Quite a lot of investigation has been done on the detection of fake news [12] and hate speech [7, 13]
from social media data. Lately, NLP researchers are directing their focus to the automatic detection of
hope-speech. Examining hope on social media is now considered a must for comprehending well-being
and the path toward goal-directed behaviors. Among the techniques and models used for the detection
of hope speech are the following.
   Firstly, the authors of [14] investigated the impact of psycholinguistic and linguistic features on
hope speech detection using a non-complex deep learning algorithm [15]. Additionally, the MUCS
team presented three proposed models for the "Hope Speech Detection for Equality, Diversity, and
Inclusion-EACL 2021" task, including CoHope-ML, a machine learning voting classifier, CoHope-NN, a
deep learning neural network model, and CoHope-TL, a transfer learning-based model [1].
   Moreover, [16] describes a study involving a curated analysis that introduced a two-level dataset
for hope speech detection in English tweets, marking the first-ever attempt to address hope speech
detection with the actual concept of hope as a multiclass classification task. Another approach proposed
in [2] utilized the SMOTE technique to resolve data imbalance issues and a 1D Conv-LSTM model for
classification.
   Furthermore, various machine learning and deep learning approaches were utilized in the hope
speech detection shared task at EACL 2021 [17]. The authors of [6] created an English-Kannada
hope speech dataset, KanHope, and the study in [18] presented hope speech detection for
posts in English and Spanish using a support vector machine (SVM), while [19] presented manually
annotated datasets for hope speech detection in English, Tamil, and Malayalam. The latter work experimented
with multiple machine learning models, including support vector machine (SVM), logistic regression,
K-nearest neighbor, and decision tree, and proposed a new CNN-based model, which
outperformed the others in macro F1-score for each language. Finally, a transformer-based
pre-trained BERT model with a rule-based language identification system was described in [20, 21] for
detecting hope speech in YouTube comments.


3. Contributions
    • Our work underscores the efficiency and potential of transformer models, particularly BERT, in
      fostering positive online communication by accurately recognizing hope speech.
    • We advance the current state of research by demonstrating how transformer models like BERT
      can improve the quality of online interactions.
    • We highlight the promising approach that sets the stage for future studies and practical applica-
      tions aimed at enhancing online communication.
    • By leveraging the advanced capabilities of BERT and similar models, we contribute to creating a
      more supportive and positive online environment.
    • Our research aims to enhance the overall user experience and promote positive social discourse
      through the effective use of transformer models.


4. Methodology
We started our experiments with traditional machine learning (ML) algorithms, namely Support Vector
Machine (SVM) and Random Forest (RF). However, the transformer-based BERT model gave results
superior to both. We implemented the ML algorithms using scikit-learn and used the TF-IDF vectorization
technique to transform the text data into TF-IDF vectors before feeding it into the machine learning
models.

Table 1
Distribution of labels for training and evaluation data
                 Split          Hope   Not Hope
                 Training       3104     3088
                 Evaluation      530      502


4.1. Shared Task Description
Hope is an openness of spirit towards the future: a desire, expectation, or wish for something to
happen or to be true. It plays an important role in a person’s state of mind, emotions, behavior,
and decision-making [16, 22].
  The shared task on hope speech that we address, “Task 2: Hope as Expectations,” comprises two subtasks:

    • Subtask 2.a: Binary Hope Speech Detection from English and Spanish texts
    • Subtask 2.b: Multiclass Hope Speech Detection from English and Spanish texts.

   We work specifically on Subtask 2.a, Binary Hope Speech Detection, for the English texts. Given
an English tweet, the system should predict its class: based on the training data, our models
categorize each text as either ’Hope’ or ’Not Hope’. The assessment is based on precision, recall,
and F1-score.
   ·     Hope: tweets that display a mention of hope.
   ·     Not Hope: tweets that cannot be described as having hope, expectation, or desire.
   The organizers also provide training, validation, and test datasets, including the gold test dataset
for further experimentation.

4.2. Dataset Description
This paper used the corpus from [16] provided by the “HOPE at IberLEF 2024” organizers [23, 24] to
train and tune the models. The dataset comprises English and Spanish tweets from the first half of
2022, amounting to approximately 100,000 tweets per language.
   In our submission, we address only binary hope speech detection for the English tweets dataset [16].
Each tweet in the dataset is labeled ’Hope’ if it contains hope speech and ’Not Hope’ otherwise.
When a tweet is given to the proposed system, it is classified into one of these classes [17]. The
distribution of labels for training and evaluation data is shown in Table 1; Figures 1 and 2 show
the same label distributions graphically.

4.3. Data Preprocessing
Comments in raw format are highly unstructured and contain irrelevant information that may
cause any AI-based model to malfunction [25]. The dataset was therefore cleaned and pre-processed before
model implementation. The primary tool for preprocessing is the ’re’ module from Python’s standard
library, which provides regular expressions for text manipulation tasks. The following
steps were employed:
   1. Remove URLs: Any sequence of characters starting with "http", followed by one or more
non-whitespace characters (\S+), is replaced with an empty string (’’), effectively removing URLs
from the text.
   2. Remove numbers: This step removes all numbers from the text.
   3. Remove special characters: This step removes all special characters from the text except whitespace.
   4. Remove emojis: Emojis are non-ASCII characters, so they are removed by first encoding the text
to ASCII using encode(’ascii’, ’ignore’), which drops every non-ASCII character, and then decoding
the result back to a Unicode string using decode(’ascii’).
   5. Convert text to lowercase: Finally, the lower() method is used to convert all characters in the text
to lowercase.

Figure 1: Training dataset label distribution         Figure 2: Evaluation dataset label distribution

   In summary, the cleaning function takes a piece of text as input, removes URLs, numbers, special
characters, and emojis, and converts the text to lowercase. This cleaning process is essential to
improving the quality of the data being used. Figures 3 and 4 show samples of the dataset before
and after preprocessing.


Figure 3: Training dataset before preprocessing       Figure 4: Training dataset after preprocessing
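
The cleaning steps of Section 4.3 can be sketched as a single Python function. This is a minimal illustration rather than the exact code used in our experiments: the function name clean_text, the sample tweet, and the precise regular expressions are assumptions chosen to match the description of each step.

```python
import re


def clean_text(text: str) -> str:
    """Apply the five cleaning steps in the order described in Section 4.3."""
    # 1. Remove URLs: "http" followed by any run of non-whitespace characters.
    text = re.sub(r"http\S+", "", text)
    # 2. Remove numbers.
    text = re.sub(r"\d+", "", text)
    # 3. Remove special characters: keep only word characters and whitespace
    #    (assumed pattern; note that \w also keeps underscores).
    text = re.sub(r"[^\w\s]", "", text)
    # 4. Remove emojis and any other non-ASCII characters.
    text = text.encode("ascii", "ignore").decode("ascii")
    # 5. Convert to lowercase.
    return text.lower()


# Hypothetical tweet used only for illustration.
sample = "I HOPE 2024 brings better days!! 😊 https://t.co/abc123"
cleaned = clean_text(sample)
print(cleaned.split())
```

Removing tokens leaves runs of spaces behind; the description above does not mention whitespace normalization, so the sketch does not collapse them.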


4.4. Machine Learning Models
We implemented traditional machine learning algorithms, including Random Forest (RF) and Support
Vector Machine (SVM), to classify text data for binary hope speech detection.

   Random Forest (RF)
Random Forest is an ensemble learning method used for classification and regression tasks that
aggregates the predictions of multiple individual decision trees. In our experiments, the Random Forest
classifier achieved an overall accuracy of 51% and a macro-average F1-score of 0.51. Because the
evaluation data is almost balanced (530 Hope versus 502 Not Hope tweets, Table 1), these values are
close to chance level, so there is considerable room for improvement.
   Support Vector Machine (SVM)
Support Vector Machine is a robust classification algorithm that finds the optimal boundary to separate
different classes in the data, maximizing the margin between them. It is particularly effective in
high-dimensional spaces, making it suitable for text classification tasks. The SVM model in our study achieved
an accuracy of 50% and a macro-average F1-score of 0.50 on the English dataset for binary hope speech
detection. This performance is slightly lower than that of the Random Forest model; further optimization
and fine-tuning of the model parameters may be necessary to improve it.

4.4.1. Experimental Setups for Machine Learning Models
Feature engineering is a critical step that transforms raw text data into meaningful numerical
representations, enabling machine learning models to learn and make accurate predictions. Techniques like
TF-IDF, word embeddings, and feature selection help capture the essence of the text, while scaling and
normalization ensure numerical stability. For advanced models, handling text sequences appropriately
is essential. Careful feature engineering sets a solid foundation for training effective machine
learning models. For both ML models described above, we use TF-IDF
vectorization for feature extraction.
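
To make the TF-IDF weighting concrete, the following sketch computes it by hand for three toy documents. It uses the textbook definition (term frequency times the logarithm of the inverse document frequency); scikit-learn's TfidfVectorizer applies a smoothed IDF and L2 normalization by default, so its values differ from these. The function name tf_idf and the toy documents are assumptions for illustration only.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document using textbook TF-IDF."""
    tokenized = [doc.split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: the number of documents in which each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            # tf = relative frequency in this document; idf = log(N / df).
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical, already-cleaned documents.
docs = ["i hope things get better", "things are bad", "i hope so"]
vectors = tf_idf(docs)
print(vectors[0])
```

In the first document, "better" occurs in only one of the three documents and therefore receives a higher weight than "hope", which occurs in two.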

  • Model Training
Two different models are trained:

    • Random Forest Classifier: Trained with 200 decision trees on the training data.
    • Support Vector Classifier: Trained with a linear kernel on the TF-IDF transformed training
      data.

   • Text Data
Each entry is annotated as either Hope or Not Hope, and we were provided with separate training,
testing, and validation sets. Both experiments assume that X_train and y_train are the preprocessed text data
and corresponding labels for training, and that dftest[’text’] is the text data for testing.
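
The two experimental setups can be sketched with scikit-learn as follows. Only the TF-IDF features, the 200 trees, and the linear kernel come from the description above; the helper name build_models, the random seed, all remaining default hyperparameters, and the toy data are assumptions, and the toy scores say nothing about the results in this paper.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import f1_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC


def build_models(random_state=42):
    """Return the two TF-IDF based classifiers described in Section 4.4.1."""
    return {
        # Random Forest with 200 decision trees (seed is an assumption).
        "Random Forest": make_pipeline(
            TfidfVectorizer(),
            RandomForestClassifier(n_estimators=200, random_state=random_state),
        ),
        # Support Vector Classifier with a linear kernel.
        "SVM": make_pipeline(TfidfVectorizer(), SVC(kernel="linear")),
    }


# Hypothetical toy data standing in for the preprocessed tweets and labels.
X_train = ["i hope it gets better", "hoping for good news", "we wish you well",
           "this is terrible", "nothing ever works", "the weather is cold"]
y_train = ["Hope", "Hope", "Hope", "Not Hope", "Not Hope", "Not Hope"]
X_test = ["i hope you get well", "this is cold"]
y_test = ["Hope", "Not Hope"]

scores = {}
for name, model in build_models().items():
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)
    scores[name] = f1_score(y_test, predictions, average="macro")
print(scores)
```

Wrapping the vectorizer and the classifier in one pipeline ensures that the TF-IDF vocabulary is fitted on the training texts only and then reused unchanged for the test texts.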

4.5. BERT
Bidirectional Encoder Representations from Transformers (BERT) [26] is a transformer-based language
model that is pre-trained on large text corpora and then fine-tuned for downstream tasks. The model was
chosen because of its outstanding results on other tasks [27]. In our experiment, it also outperformed the
traditional machine learning models, with a macro-average F1-score of 0.85. Table 2 shows the parameter
values for BERT.

4.5.1. Experimental Setups for BERT
Word embeddings capture the semantic meaning and context of words by representing them as dense
vectors in a continuous vector space. BERT generates embeddings using bidirectional context, i.e.,
it analyzes context from both the left and the right of a word. In addition, BERT’s attention architecture
computes attention for the whole input sequence in parallel.
   We implement and fine-tune a BERT model for binary classification using the ktrain library. This
involves importing the necessary libraries, initializing a BERT model
(google/bert_uncased_L-12_H-768_A-12), and setting up the transformer with a maximum input length
of 400 tokens. The training and validation data are preprocessed to fit the BERT
model requirements. A classifier is then created, and a Learner object is initialized to facilitate training
with a batch size of 12 for 3 epochs. Finally, a Predictor object is created to make
predictions on the test set, leveraging the pre-trained BERT model’s knowledge after fine-tuning it on
the specific dataset.
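
The fine-tuning procedure described above can be sketched with ktrain as follows. The model name, maximum length, batch size, and number of epochs are taken from the text; the learning rate, the use of fit_onecycle, the class-name order, the integer label encoding, and the function name fine_tune_bert are assumptions, since the paper does not report them. ktrain is imported inside the function so that the configuration can be inspected without the library installed.

```python
MODEL_NAME = "google/bert_uncased_L-12_H-768_A-12"
MAX_LEN = 400          # maximum input length in tokens
BATCH_SIZE = 12
EPOCHS = 3
LEARNING_RATE = 5e-5   # assumption: not reported in the paper
CLASS_NAMES = ["Not Hope", "Hope"]  # assumption: labels encoded as 0 and 1


def fine_tune_bert(x_train, y_train, x_val, y_val):
    """Fine-tune BERT on lists of texts and integer labels; return a Predictor."""
    import ktrain
    from ktrain import text

    # Set up the transformer wrapper and preprocess both splits for BERT.
    transformer = text.Transformer(MODEL_NAME, maxlen=MAX_LEN,
                                   class_names=CLASS_NAMES)
    train_data = transformer.preprocess_train(x_train, y_train)
    val_data = transformer.preprocess_test(x_val, y_val)

    # Create the classifier and wrap it in a Learner for training.
    model = transformer.get_classifier()
    learner = ktrain.get_learner(model, train_data=train_data,
                                 val_data=val_data, batch_size=BATCH_SIZE)
    learner.fit_onecycle(LEARNING_RATE, EPOCHS)

    # The Predictor applies the same preprocessing to unseen texts.
    return ktrain.get_predictor(learner.model, preproc=transformer)


# Usage (requires ktrain and downloads the pre-trained weights):
#   predictor = fine_tune_bert(x_train, y_train, x_val, y_val)
#   labels = predictor.predict(list_of_test_texts)
```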
   Table 2
   Parameter values
                                        Parameter          Value
                                        vocab_size         3000
                                        embedding_dim      100
                                        max_length         200
                                        padding_type       ’post’
                                        trunc_type         ’post’
                                        num_epochs         10

Figure 5: Comparison of the runs submitted


5. Results and Discussion
For this task, we used Support Vector Machine (SVM), Random Forest (RF), and a transformer-based
BERT model on the English dataset. Of these models, the pre-trained BERT model produced the
best results, yielding a macro-average F1-score of 0.85. The results for all models are shown in
Table 3.

   Table 3
   Evaluation metrics
                 Model                       Precision   Recall     F1-Score   Accuracy
                 BERT                          0.85       0.85        0.85       0.85
                 Random Forest                 0.51       0.51        0.51       0.51
                 Support Vector Machine        0.50       0.50        0.50       0.50
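
For reference, the macro-average F1-score reported in Table 3 is the unweighted mean of the per-class F1-scores. The sketch below computes it from scratch for a small set of hypothetical predictions; the function name macro_f1 and the example labels are illustrative and unrelated to our actual system outputs.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1-scores."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        # F1 is the harmonic mean of precision and recall.
        if precision + recall:
            f1_scores.append(2 * precision * recall / (precision + recall))
        else:
            f1_scores.append(0.0)
    return sum(f1_scores) / len(f1_scores)


# Hypothetical gold labels and predictions.
y_true = ["Hope", "Hope", "Hope", "Not Hope", "Not Hope", "Not Hope"]
y_pred = ["Hope", "Hope", "Hope", "Hope", "Not Hope", "Not Hope"]
score = macro_f1(y_true, y_pred)
print(round(score, 4))
```

Here the Hope class has precision 0.75 and recall 1.0, the Not Hope class has precision 1.0 and recall 2/3, and the macro-average of the two resulting F1-scores is about 0.83.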

   Figure 5 compares the results of the shared task participants. The red dashed line marks a threshold
of 0.85, which is exactly the score of the BERT model in our experiment. The graph shows that our result
is above the average of the submitted runs, placing us in the upper range of the performance spectrum
and indicating a strong performance relative to the other participants.


6. Conclusion
In our study, we tackled the complex task of detecting hope speech using advanced machine learning
techniques, with a particular focus on the BERT Transformer approach applied to an English dataset
provided by the HOPE at IberLEF 2024 shared task organizers. We compared the performance of
traditional machine learning methods, such as Support Vector Machine (SVM) and Random Forest,
against the state-of-the-art BERT model. Our findings revealed that the BERT model significantly
outperformed these traditional methods, achieving an impressive F1-score of 0.85, thus demonstrating
its superior natural language processing capabilities in identifying hope speech.
   Our experiments used TF-IDF vectorization as the feature extraction step for the traditional machine
learning models. With these features, the SVM and Random Forest classifiers reached macro-average
F1-scores of only 0.50 and 0.51, which suggests that alternative feature representations are worth
exploring for these models.
   Looking forward, we plan to expand our research by incorporating larger datasets, which we believe
will provide a more comprehensive understanding and enable us to fine-tune the models further.
Further optimization of the model parameters is also necessary. Potential steps include
hyperparameter tuning and exploring different feature engineering techniques. By doing so, we aim to
achieve higher accuracy and better overall performance in future iterations.


Acknowledgments
The work was done with partial support from the Mexican Government through the grant A1-S-47854
of CONACYT, Mexico, and grants 20241816, 20241819, and 20240951 of the Secretaría de Investigación
y Posgrado of the Instituto Politécnico Nacional, Mexico. The authors thank the CONACYT for the
computing resources brought to them through the Plataforma de Aprendizaje Profundo para Tecnologías
del Lenguaje of the Laboratorio de Supercómputo of the INAOE, Mexico, and acknowledge the support
of Microsoft through Microsoft Latin America PhD Award.


References
 [1] F. Balouchzahi, B. Aparna, H. Shashirekha, Mucs@ lt-edi-eacl2021: Cohope-hope speech detection
     for equality, diversity, and inclusion in code-mixed texts, in: Proceedings of the First Workshop
     on Language Technology for Equality, Diversity and Inclusion, 2021, pp. 180–187.
 [2] A. Gowda, F. Balouchzahi, H. Shashirekha, G. Sidorov, Mucic@ lt-edi-acl2022: Hope speech
     detection using data re-sampling and 1d conv-lstm, in: Proceedings of the second workshop on
     language technology for equality, diversity and inclusion, 2022, pp. 161–166.
 [3] G. Bade, O. Kolesnikova, G. Sidorov, J. Oropeza, Social media fake news classification using
     machine learning algorithm, in: Proceedings of the Fourth Workshop on Speech, Vision, and
     Language Technologies for Dravidian Languages, 2024, pp. 24–29.
 [4] M. Zamir, M. Tash, Z. Ahani, A. Gelbukh, G. Sidorov, Lidoma@ dravidianlangtech 2024: Identifying
     hate speech in telugu code-mixed: A bert multilingual, in: Proceedings of the Fourth Workshop
     on Speech, Vision, and Language Technologies for Dravidian Languages, 2024, pp. 101–106.
 [5] M. Shahiki-Tash, J. Armenta-Segura, Z. Ahani, O. Kolesnikova, G. Sidorov, A. Gelbukh, Lidoma
     at homomex2023@ iberlef: Hate speech detection towards the mexican spanish-speaking lgbt+
     population. the importance of preprocessing before using bert-based models, in: Proceedings of
     the Iberian Languages Evaluation Forum (IberLEF 2023), 2023.
 [6] A. Hande, R. Priyadharshini, A. Sampath, K. P. Thamburaj, P. Chandran, B. R. Chakravarthi, Hope
     speech detection in under-resourced kannada language, arXiv preprint arXiv:2108.04616 (2021).
 [7] M. G. Yigezu, T. Kebede, O. Kolesnikova, G. Sidorov, A. Gelbukh, Habesha@ dravidianlangtech:
     Utilizing deep and transfer learning approaches for sentiment analysis, in: Proceedings of the Third
     Workshop on Speech and Language Technologies for Dravidian Languages, 2023, pp. 239–243.
 [8] M. S. Tash, Z. Ahani, A. Tonja, M. Gemeda, N. Hussain, O. Kolesnikova, Word level language
     identification in code-mixed kannada-english texts using traditional machine learning algorithms,
     in: Proceedings of the 19th International Conference on Natural Language Processing (ICON):
     Shared Task on Word Level Language Identification in Code-mixed Kannada-English Texts, 2022,
     pp. 25–28.
 [9] M. Zamir, M. Tash, Z. Ahani, A. Gelbukh, G. Sidorov, Tayyab@ dravidianlangtech 2024: detecting
     fake news in malayalam lstm approach and challenges, in: Proceedings of the Fourth Workshop
     on Speech, Vision, and Language Technologies for Dravidian Languages, 2024, pp. 113–118.
[10] M. S. Tash, Z. Ahani, O. Kolesnikova, G. Sidorov, Analyzing emotional trends from x platform using
     senticnet: A comparative analysis with cryptocurrency price, arXiv preprint arXiv:2405.03084
     (2024).
[11] A. L. Tonja, M. G. Yigezu, O. Kolesnikova, M. S. Tash, G. Sidorov, A. Gelbukh, Transformer-based
     model for word level language identification in code-mixed kannada-english texts, arXiv preprint
     arXiv:2211.14459 (2022).
[12] M. Yigezu, O. Kolesnikova, G. Sidorov, A. Gelbukh, Habesha@ dravidianlangtech 2024: Detecting
     fake news detection in dravidian languages using deep learning, in: Proceedings of the Fourth
     Workshop on Speech, Vision, and Language Technologies for Dravidian Languages, 2024, pp.
     156–161.
[13] Z. Ahani, M. Tash, M. Zamir, I. Gelbukh, Zavira@ dravidianlangtech 2024: Telugu hate speech
     detection using lstm, in: Proceedings of the Fourth Workshop on Speech, Vision, and Language
     Technologies for Dravidian Languages, 2024, pp. 107–112.
[14] F. Balouchzahi, S. Butt, G. Sidorov, A. Gelbukh, Cic@ lt-edi-acl2022: Are transformers the only
     hope? hope speech detection for spanish and english comments, in: Proceedings of the second
     workshop on language technology for equality, diversity and inclusion, 2022, pp. 206–211.
[15] Z. Ahani, M. Shahiki Tash, Y. Ledo Mezquita, J. Angel, Utilizing deep learning models for the
     identification of enhancers and super-enhancers based on genomic and epigenomic features,
     Journal of Intelligent & Fuzzy Systems (2024) 1–11.
[16] F. Balouchzahi, G. Sidorov, A. Gelbukh, PolyHope: Two-level hope speech detection from tweets,
     Expert Systems with Applications 225 (2023) 120078. doi:10.1016/j.eswa.2023.120078.
[17] M. D. S. S. Eswar, N. Balaji, V. S. Sarma, Y. C. Krishna, S. Thara, Hope speech detection in tamil
     and english language, in: 2022 International Conference on Inventive Computation Technologies
     (ICICT), IEEE, 2022, pp. 51–56.
[18] M. G. Yigezu, G. Y. Bade, O. Kolesnikova, G. Sidorov, A. Gelbukh, Multilingual hope speech
     detection using machine learning (2023).
[19] B. R. Chakravarthi, Hope speech detection in youtube comments, Social Network Analysis and
     Mining 12 (2022) 75.
[20] S. Gundapu, R. Mamidi, Autobots@ lt-edi-eacl2021: one world, one family: hope speech detection
     with bert transformer model, in: Proceedings of the First Workshop on Language Technology for
     Equality, Diversity and Inclusion, 2021, pp. 143–148.
[21] G. Sidorov, F. Balouchzahi, S. Butt, A. Gelbukh, Regret and hope on transformers: An analysis of
     transformers on regret and hope speech detection datasets, Applied Sciences 13 (2023) 3983.
[22] D. García-Baena, F. Balouchzahi, S. Butt, M. Á. García-Cumbreras, A. Lambebo Tonja, J. A. García-
     Díaz, S. Bozkurt, B. R. Chakravarthi, H. G. Ceballos, V.-G. Rafael, G. Sidorov, L. A. Ureña-López,
     A. Gelbukh, S. M. Jiménez-Zafra, Overview of HOPE at IberLEF 2024: Approaching Hope Speech
     Detection in Social Media from Two Perspectives, for Equality, Diversity and Inclusion and as
     Expectations, Procesamiento del Lenguaje Natural 73 (2024).
[23] L. Chiruzzo, S. M. Jiménez-Zafra, F. Rangel, Overview of IberLEF 2024: Natural Language Process-
     ing Challenges for Spanish and other Iberian Languages, in: Proceedings of the Iberian Languages
     Evaluation Forum (IberLEF 2024), co-located with the 40th Conference of the Spanish Society for
     Natural Language Processing (SEPLN 2024), CEUR-WS.org, 2024.
[24] D. García-Baena, M. Á. García-Cumbreras, S. M. Jiménez-Zafra, J. A. García-Díaz, R. Valencia-
     García, Hope speech detection in Spanish: The LGBT case, Language Resources and Evaluation
     (2023) 1–28.
[25] D. Khanna, M. Singh, P. Motlicek, Idiap_tiet@ lt-edi-acl2022: Hope speech detection in social media
     using contextualized bert with attention mechanism, in: Proceedings of the Second Workshop on
     Language Technology for Equality, Diversity and Inclusion, 2022, pp. 321–325.
[26] J. Devlin, M.-W. Chang, K. Lee, K. Toutanova, Bert: Pre-training of deep bidirectional transformers
     for language understanding, arXiv preprint arXiv:1810.04805 (2018).
[27] F. Ullah, M. Zamir, M. Arif, M. Ahmad, E. Felipe-Riveron, A. Gelbukh, Fida@ dravidianlangtech
     2024: A novel approach to hate speech detection using distilbert-base-multilingual-cased, in:
     Proceedings of the Fourth Workshop on Speech, Vision, and Language Technologies for Dravidian
     Languages, 2024, pp. 85–90.