Exploring Sentiment Analysis in Health with CNN, GRU and Zero-Shot Learning on Imbalanced
Textual Data
Souaad HAMZA-CHERIF¹, Nesma SETTOUTI¹,²
¹ Biomedical Engineering Laboratory, University of Tlemcen, Algeria
² L@bISEN, Yncréa ouest, Brest, France


Abstract
Sentiment analysis has recently gained popularity due to the emergence of social media and the
semantic web, which provide platforms for individuals to express their thoughts and emotions. In
the field of health, analyzing this type of data can have numerous benefits, particularly in
psychology, where it can aid in automatically identifying a patient’s mental state. In this study,
we present a method for categorizing emotions from text data in the health field. We evaluated
three classifiers on the unbalanced EmoHD dataset: Convolutional Neural Networks (CNN), Gated
Recurrent Units (GRU), and Zero-Shot Learning (ZSL). Our findings showed that the CNN model gives
the best performance [1].


Keywords
Sentiment analysis, CNN, GRU, Zero-Shot learning, classification.


1. Introduction
Sentiments are complex phenomena that can take many forms, such as emotions, reactions to
events, or mental states. They can be positive or negative: joyful, bored, sad, and so on.
   Sentiment analysis has become a rapidly growing field, particularly with the rise of social
media and the semantic web, which make it easier for people to express their thoughts and
emotions through text, emoticons, and other means.
   Machine understanding and analysis of sentiments is particularly important in the field of
health, where a patient’s emotional state can greatly impact their healing process. For example, a
patient suffering from depression may not be in a favorable mental state to begin their recovery.
Similarly, in the field of psychology, automatic sentiment analysis can be very beneficial in
identifying a patient’s psychological state as well as the different personality traits that
can explain human reasoning and behavior.
   In this context, machine learning techniques and natural language processing (NLP) offer
promising tools that have proven able to provide automatic solutions in fields such as health,
because they play an important role in understanding the meaning of content in order to classify,
analyze, and predict human sentiments by machine. In this article, we present an approach for
sentiment analysis from text data in the field of health. We compare various machine learning
methods, including convolutional neural networks (CNN), gated recurrent units (GRU), and
zero-shot learning (ZSL).

RIF 2023: The 12th Seminary of Computer Science Research at Feminine, March 9, 2023, Constantine, Algeria
souad.hamzacherif@univ-tlemcen.dz (S. HAMZA-CHERIF); nesma.settouti@univ-tlemcen.dz (N. SETTOUTI)
ORCID: 0000-0002-4733-197X (S. HAMZA-CHERIF); 0000-0002-7423-0090 (N. SETTOUTI)
© 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073

   The rest of the article is structured as follows. Section 2 discusses related work in the
field, and Section 3 presents our proposed approach and research strategy. Section 4 analyzes
the results we obtained, and Section 5 concludes with perspectives and future work.


2. Related works
Several recent studies have focused on using machine learning for sentiment analysis. One
notable example is [2], where the authors used a lexicon-based classification approach to
predict the results of the 2016 U.S. Presidential Election between Hillary Clinton and Donald
Trump using Twitter data.
   In [3], the authors analyzed sentiments towards the AstraZeneca, Pfizer and Moderna COVID-19
vaccines using Twitter data and the AFINN lexicon. The results showed positive sentiment
towards the Pfizer and Moderna vaccines. In [4], the authors used the Word2vec model to compute
vector representations of words and then applied a Convolutional Neural Network (CNN)
to analyze sentiment in a corpus of movie review excerpts with five labels
(negative, rather negative, neutral, rather positive, and positive), achieving a test accuracy of
45.4%.
   Other studies have also applied sentiment analysis to health-related textual data. In [5], the
authors used term frequency-inverse document frequency (TF-IDF) vectorization and compared
the performance of different classifiers for emotion classification on the EmoHD dataset. In [6],
the authors used the Intelligent Water Drop algorithm to select features of interest from the
EmoHD dataset and classify emotions in the health domain.
   In [7], the authors used deep learning models for Arabic aspect-based sentiment analysis. To
take advantage of both word and character representations, and extract the main opinion aspect,
they combined a bidirectional GRU, a Convolutional Neural Network (CNN), and a Conditional
Random Field (CRF), and then used an interactive attention network based on a bidirectional
GRU (IAN-BGRU) to identify the sentiment polarity towards the extracted aspects.
   In [8], the authors trained a unified model that performs zero-shot aspect-based sentiment
analysis (ABSA) without using any annotated data for a new domain using Zero-Shot Learning.
They evaluated it for end-to-end aspect-based sentiment analysis (E2E ABSA), which shows
that ABSA can be conducted without any human-annotated ABSA data.
   In [9], the authors used a modified GRU model to classify and understand sentiments
in tweets from an online site, compared it to the long short-term memory model (LSTM)
and the bidirectional long short-term memory model (BiLSTM), and demonstrated
that the modified GRU outperformed both. In [10], the authors proposed a two-stage emotion
detection methodology. In the first stage, they used a Zero-Shot Learning model based
on a sentence transformer, returning the probabilities for subsets of 34 emotions. In the second
stage, they used the output of the zero-shot model as input to a machine learning classifier
trained on the sentiment labels in a supervised manner using ensemble learning.
   In this work, we compare different approaches for classifying sentiments from health-related
textual data (EmoHD) using CNN, GRU, and ZSL models. Our primary goal is to explore other
deep learning models on the EmoHD dataset and evaluate their performance, especially since
these models have demonstrated their effectiveness in the field. We then aim to evaluate the
annotation of the EmoHD dataset using the Zero-Shot Learning model.


3. Proposed approach
In this article, we present a method for classifying sentiments from text data in the EmoHD
health dataset [5]. The EmoHD dataset is composed of 4,202 text samples from eight disease
classes and six emotion classes, collected from various online sources. Our approach, shown
in Figure 1, involves several steps. First, we pre-process the data to improve the classification.
As the EmoHD dataset is unbalanced, we then perform resampling to balance the data. Finally,
we classify emotions by experimenting with three methods: CNN, GRU, and ZSL.


Figure 1: Sentiment classification and analysis on imbalanced Health-Related textual data framework


3.1. Data preprocessing
To classify the text data in the EmoHD dataset, it is necessary to perform a series of
pre-processing steps that remove noise from the data and improve the classification accuracy.
The main steps are:
    • Lowercase conversion:
      Converting all terms to lowercase is necessary to avoid duplicates: although words
      such as "Health" and "health" have the same meaning, they are treated as separate
      lexical units if they differ in case.
    • Removal of punctuation:
      Removing punctuation that does not provide any useful information
      helps to improve the performance of data classification.
    • Removal of stop words:
      Stop words are very common in the language being studied but do not provide any
      informative value for understanding the "meaning" of a document or corpus, so they
      are removed.
    • Removal of rare and common words:
      To avoid the negative impact that rare and common words can have on the classification,
      we remove them based on their frequency of appearance in the text. This reduces
      the noise they generate in the text.
    • Lemmatization:
      It consists of replacing each word with its canonical form; for example, the word "known" is
      replaced with its canonical form "know". This step is useful for the thematic classification
      of texts because it treats different variants derived from the same form or root as a single
      word.
    • Tokenization:
      It is the act of parsing text into tokens; in other words, the text is segmented into
      linguistic units such as words, punctuation marks, numbers, and alphanumeric strings.
      Each element corresponds to a token that is then used in the analysis.
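
The pre-processing steps listed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the stop-word set, the lemma dictionary, the thresholds `min_count` and `max_doc_ratio`, and the example sentences are all hypothetical, and tokenization is performed early for convenience rather than as the last step.

```python
import string
from collections import Counter

# Hypothetical, deliberately tiny resources; a real pipeline would rely on a full
# stop-word list and a proper lemmatizer instead of these hand-written stand-ins.
STOP_WORDS = {"the", "a", "an", "is", "was", "of", "in", "to", "and", "has", "by"}
LEMMAS = {"known": "know", "died": "die", "patients": "patient", "reported": "report"}


def tokenize(text):
    """Lowercase the text, strip ASCII punctuation, and split on whitespace."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    return text.split()


def preprocess(corpus, min_count=1, max_doc_ratio=1.0):
    """Return one cleaned token list per document.

    min_count: tokens occurring fewer times than this in the corpus are dropped (rare words).
    max_doc_ratio: tokens present in more than this share of documents are dropped
    (common words).
    """
    # Remove stop words, then map each remaining token to its canonical form.
    docs = [[LEMMAS.get(t, t) for t in tokenize(d) if t not in STOP_WORDS] for d in corpus]
    term_freq = Counter(t for doc in docs for t in doc)
    doc_freq = Counter(t for doc in docs for t in set(doc))
    n_docs = len(docs)
    return [
        [t for t in doc
         if term_freq[t] >= min_count and doc_freq[t] / n_docs <= max_doc_ratio]
        for doc in docs
    ]


corpus = [
    "The patient has DIED of dengue fever.",
    "Dengue fever was reported in Karachi!",
    "Patients in the city known to have fever.",
]
cleaned = preprocess(corpus, min_count=2, max_doc_ratio=0.9)
print(cleaned)  # [['patient', 'dengue'], ['dengue'], ['patient']]
```

In this toy run, "fever" is dropped as a common word because it appears in every document, and words seen only once are dropped as rare.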

3.2. Data re-sampling
The EmoHD dataset has six class labels: Angry (1343), Excited (1215), Fear (742), Happy (522),
Sad (358), and Bored (22). As the data is unbalanced, it is necessary to perform resampling to
balance it. Our approach uses naive oversampling: observations from the minority classes are
randomly duplicated in order to increase their representation. This is done by
re-sampling with replacement.
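
As an illustration of naive oversampling, the sketch below duplicates randomly drawn samples from each smaller class until every class reaches the size of the largest one. The function name `random_oversample`, the fixed seed, and the toy data are assumptions made for the example; the source does not state the target size per class.

```python
import random
from collections import Counter


def random_oversample(texts, labels, seed=0):
    """Duplicate randomly chosen minority-class samples until all classes match the largest."""
    rng = random.Random(seed)
    by_class = {}
    for text, label in zip(texts, labels):
        by_class.setdefault(label, []).append(text)
    target = max(len(items) for items in by_class.values())

    out_texts, out_labels = [], []
    for label, items in by_class.items():
        # Sampling with replacement: the same original text may be drawn several times.
        extra = rng.choices(items, k=target - len(items))
        for text in items + extra:
            out_texts.append(text)
            out_labels.append(label)
    return out_texts, out_labels


# Toy data imitating an imbalanced label distribution (texts are placeholders).
labels = ["Angry"] * 6 + ["Sad"] * 3 + ["Bored"] * 1
texts = [f"sample {i}" for i in range(len(labels))]
bal_texts, bal_labels = random_oversample(texts, labels)
print(Counter(bal_labels))
```

One design point worth keeping in mind: duplicates created before a train/validation split can end up on both sides of the split, so a common practice is to oversample only the training portion.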

3.3. Classification models
    • Convolutional Neural Network (CNN)
      Convolutional Neural Networks (CNNs) are deep, feedforward artificial neural
      networks composed of layers of nodes, including an input layer, one or more hidden
      layers, and an output layer [4]. Each node connects to other nodes and has an associated
      weight and threshold. If the output of an individual node exceeds the designated threshold
      value, that node is activated and passes data to the next layer of the network; otherwise,
      no data is passed on. In our approach, we implemented a CNN with 5 layers, which are:
         – The Embedding Layer: This is the input layer that maps words in the text to
           real-valued vectors. It takes three inputs: the dimension of each
           embedded word (input_dim), the maximum number of words in the vocabulary
           (max_feature), and the maximum length of a sequence (input_length).
         – Spatial Dropout Layer: This layer is used to reduce overfitting during the training
           of the model by randomly deactivating units. This means that
           during each pass (forward propagation), the model learns with a different
           random configuration of active and inactive neurons.
         – 1D Convolution Layer (Temporal Convolution): This layer creates a convolution
           kernel that is convolved with the input over a single spatial (or temporal) dimension
           to produce a tensor of outputs.
         – MaxPooling1D Layer: This layer down-samples the input representation by taking
           the maximum value over a spatial window of size pool_size.
         – The Dense Layer: This is the output layer, which applies the activation function.
           In our case, we used softmax, which is commonly used for multiclass
           problems such as ours.


      Next, we compile our model. Compiling the model takes three parameters: optimizer,
      loss, and metrics. The optimizer controls the learning rate. We use ‘adam’ as our
      optimizer; it is generally a good choice in many cases and
      adjusts the learning rate throughout training. The learning rate determines how fast the
      optimal weights for the model are calculated. We used ‘categorical_crossentropy’ as our
      loss function, a common choice for multiclass classification; a lower loss indicates
      that the model is performing better. Finally, we used the ‘accuracy’ metric to monitor
      the accuracy score on the validation set during training.
    • Gated Recurrent Unit (GRU)
      The Gated Recurrent Unit (GRU) is an improved version of the standard recurrent neural
      network that was introduced in [11]. It has some advantages over long short-term memory
      (LSTM) in certain cases, such as using less memory and being faster.
      GRUs address the vanishing gradient problem of a standard RNN by using two gates, the
      update gate and the reset gate. These gates are two vectors that decide what information
      is allowed to pass to the output, and they can be trained to retain information from far
      back in time. This allows relevant information to be passed along a chain of events for
      better predictions.
      The GRU model implemented in our approach has 3 layers: an embedding layer (defined
      in the previous section), a GRU layer, and a dense layer with a softmax activation
      function. The GRU layer is a fully connected layer with gated recurrent units instead
      of simple neurons (we used 64 GRU units); it is similar to a long short-term memory
      (LSTM) layer but has fewer parameters.
      Next, we compiled our model with the same three parameters as the CNN model:
      optimizer, loss, and metrics.
    • Zero-Shot Learning (ZSL)
      Zero-shot learning (ZSL) is the ability to perform a task without any prior training
      examples. It is used to build models for classes that have not yet been labeled. ZSL
      transfers knowledge from known classes to new classes using class attributes as the basis
      of this transfer. The process of ZSL has two phases:
         – Training: Gaining knowledge about the attributes.
         – Inference: Utilizing the knowledge to classify examples into new classes.
      In our approach, we implement zero-shot classification by using the transformer library
      on the EmoHD dataset. Our proposed model takes the candidate labels ("Angry", "Bored",
      "Happy", "Sad", "Excited", "Fear") from the EmoHD dataset and the text vectorized by
      the sentence transformer. The output of the zero-shot model is a list of emotion labels
      mapped to their probabilities for the given input text.
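
To make the shape of the zero-shot output concrete, the sketch below ranks the six candidate labels for one text. The source does not detail how the scores are computed, so the scoring rule here is an assumption: cosine similarity between a text embedding and each label embedding, turned into probabilities with a softmax. The 3-dimensional vectors are made up for illustration; a real sentence transformer would supply much larger ones.

```python
import math

CANDIDATE_LABELS = ["Angry", "Bored", "Happy", "Sad", "Excited", "Fear"]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def zero_shot_scores(text_vec, label_vecs):
    """Return (label, probability) pairs sorted from most to least likely."""
    sims = [cosine(text_vec, label_vecs[label]) for label in CANDIDATE_LABELS]
    probs = softmax(sims)
    return sorted(zip(CANDIDATE_LABELS, probs), key=lambda pair: pair[1], reverse=True)


# Hypothetical embeddings, chosen by hand so that the text lies closest to "Fear".
label_vecs = {
    "Angry": [0.9, 0.1, 0.0],
    "Bored": [0.1, 0.9, 0.1],
    "Happy": [0.8, 0.5, 0.0],
    "Sad": [0.2, 0.3, 0.8],
    "Excited": [0.7, 0.6, 0.1],
    "Fear": [0.0, 0.2, 0.9],
}
text_vec = [0.1, 0.1, 0.9]

ranking = zero_shot_scores(text_vec, label_vecs)
for label, prob in ranking:
    print(f"{prob:.3f}: {label}")
```

No training on EmoHD is involved in such a scheme: the labels only enter at inference time, which is why label names with overlapping meanings can be hard to separate.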
Table 1
F1-scores of the implemented models.
        Model                       CNN     GRU     ZSL
        with re-sampling data       0.45    0.40    0.20
        without re-sampling data    0.82    0.80    0.31


4. Experiments and results
In this research, we evaluated the performance of GRU, CNN, and Zero-Shot Learning models on
the EmoHD dataset. Additionally, we analyzed the labels of the EmoHD dataset using Zero-Shot
Learning. For our testing procedure, we split the data into a training set (80%) and a validation
set (20%).
   To evaluate the performance of each model, we used the F1-score, a commonly used
evaluation metric that is more appropriate than accuracy for datasets with imbalanced classes.
The F1-score is the harmonic mean of precision and recall, and takes into account true positives
(TP), false positives (FP), and false negatives (FN). Its formula is given in equation 1.

\[
F1\text{-score} = \frac{2 \times TP}{2 \times TP + FP + FN} \tag{1}
\]
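
For a multiclass problem, the F1 formula is usually applied to each class in turn and the per-class values are then averaged. The sketch below does this with an unweighted (macro) average. The averaging scheme is an assumption, since the source does not say which one was used, and the label lists are invented for the example.

```python
def f1_per_class(y_true, y_pred):
    """Return {label: F1}, with F1 = 2*TP / (2*TP + FP + FN) computed for each class."""
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denom = 2 * tp + fp + fn
        # A class that is never present and never predicted gets 0.0 by convention here.
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1-scores."""
    per_class = f1_per_class(y_true, y_pred)
    return sum(per_class.values()) / len(per_class)


# Invented labels for illustration only.
y_true = ["Angry", "Angry", "Fear", "Sad", "Sad", "Happy"]
y_pred = ["Angry", "Fear", "Fear", "Sad", "Angry", "Happy"]

per_class = f1_per_class(y_true, y_pred)
print(per_class)
print(round(macro_f1(y_true, y_pred), 3))  # 0.708
```

Because the macro average gives every class the same weight, a very small class such as "Bored" influences the result as much as a large one, which is one reason F1 is preferred over accuracy on imbalanced data.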

We trained the models described in the "Classification models" section and evaluated their
performance on the EmoHD dataset over 12 epochs. In our study, we
evaluated the classification models with and without re-sampling the data. The results of these
evaluations are presented in Table 1.
   From the results in Table 1, it can be seen that the CNN model achieved the highest score of
82%, while the GRU model had a score of 80%, after re-sampling the data. We therefore concluded
that re-sampling improves the classification performance.
   We also noticed the poor performance of the Zero-Shot Learning model. To understand these
results, we randomly selected 50 text samples from the EmoHD dataset and classified them
according to the output class labels (Angry, Excited, Happy, Fear, Sad, Bored) in order to compare
the results of the ZSL model with the existing labels in EmoHD. An example of this can be seen in
Table 2.

   Of the 50 samples, only 12 predictions (24%) matched the labels of EmoHD. We also observed
that most of the confusion involved the "Excited" label, which was frequently
confused with other labels.


Table 2
Example of label comparison with ZSL.
        Text                                                ZSL score (label)   EmoHD label
        karachi dengue fever claim another life frontier    0.278 (Sad)         Fear
        post may fp report karachi woman died dengue        0.270 (Fear)
        fever private hospital karachi total number death   0.188 (Excited)
        fatal fever karachi reached two according detail    0.096 (Angry)
        deceased woman identified balqees bin qasim         0.095 (Bored)
        locality karachi toll infected patient city         0.070 (Happy)
        reached since beginning year according dengue
        surveillance cell total number dengue case far
        reported karachi month may.


5. Conclusion and perspectives
In this study, we proposed a method for sentiment analysis on unbalanced textual health
data. Our research focused on examining the effect of unbalanced data on text classification
and found that re-sampling improves classification performance. Additionally, we found that
the CNN model performed better than the other models in terms of F1-score. Furthermore, we
discovered that the semantic similarities among the labels used in the EmoHD dataset led
to confusion, particularly with the label "Excited", which can be mistaken for both "Angry" and
"Happy". As next steps, we plan to continue our experiments in sentiment analysis and explore
the possibility of creating new textual databases from web data.


References
 [1] Z. Zhang, V. Saligrama, Zero-shot learning via semantic similarity embedding, CoRR
     abs/1509.04767 (2015). URL: http://arxiv.org/abs/1509.04767. arXiv:1509.04767.
 [2] S. Srinivasan, R. Sangwan, C. Neill, T. Zu, Twitter data for predicting election results:
     Insights from emotion classification, IEEE Technology and Society Magazine 38 (2019)
     58–63. doi:10.1109/MTS.2019.2894472.
 [3] R. Marcec, R. Likic, Using Twitter for sentiment analysis towards AstraZeneca/Oxford,
     Pfizer/BioNTech and Moderna COVID-19 vaccines, Postgraduate Medical Journal 98 (2022)
     544–550. URL: https://pmj.bmj.com/content/98/1161/544.
     doi:10.1136/postgradmedj-2021-140685.
 [4] Sentiment analysis using convolutional neural network (????) 2359–2364.
     doi:10.1109/CIT/IUCC/DASC/PICOM.2015.349.
 [5] N. Azam, T. Ahmad, N. U. Haq, Automatic emotion recognition in healthcare data using
     supervised machine learning, PeerJ Computer Science 7 (2021) e751.
 [6] G. B. Mohammad, S. Potluri, A. Kumar, R. Tiwari, R. Shrivastava, S. Kumar, K. Srihari,
     K. Dekeba, et al., An artificial intelligence-based reactive health care system for emotion
     detections, Computational Intelligence and Neuroscience 2022 (2022).
 [7] M. Mustafa, T. H. A. Soliman, A. I. Taloba, M. F. Seedik, Arabic aspect based sentiment
     analysis using bidirectional GRU based models, CoRR abs/2101.10539 (2021).
     URL: https://arxiv.org/abs/2101.10539. arXiv:2101.10539.
 [8] L. Shu, H. Xu, B. Liu, J. Chen, Zero-shot aspect-based sentiment analysis, CoRR
     abs/2202.01924 (2022). URL: https://arxiv.org/abs/2202.01924. arXiv:2202.01924.
 [9] Sentiment analysis using modified GRU, in: Proceedings of the 2022 Fourteenth
     International Conference on Contemporary Computing, ????, p. 356–361.
     URL: https://doi.org/10.1145/3549206.3549270. doi:10.1145/3549206.3549270.
[10] S. G. Tesfagergish, J. Kapociute-Dzikiene, R. Damasvicius, Zero-shot emotion detection
     for semi-supervised sentiment analysis using sentence transformers and ensemble
     learning, Applied Sciences 12 (2022). URL: https://www.mdpi.com/2076-3417/12/17/8662.
     doi:10.3390/app12178662.
[11] J. Chung, Ç. Gülçehre, K. Cho, Y. Bengio, Empirical evaluation of gated recurrent neural
     networks on sequence modeling, CoRR abs/1412.3555 (2014).
     URL: http://arxiv.org/abs/1412.3555. arXiv:1412.3555.