<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Computer Science Research Seminar for Women, March</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Exploring Sentiment Analysis in Health with CNN, GRU and Zero-Shot Learning on Imbalanced Textual Data.</article-title>
      </title-group>
      <contrib-group>
        <aff id="aff0">
          <label>0</label>
          <institution>Biomedical Engineering Laboratory, University of Tlemcen</institution>
          ,
          <country country="DZ">Algeria</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2023</year>
      </pub-date>
      <volume>9</volume>
      <issue>2023</issue>
      <fpage>0000</fpage>
      <lpage>0002</lpage>
      <abstract>
        <p>Sentiment analysis has recently gained popularity due to the emergence of social media and the semantic web, which provide platforms for individuals to express their thoughts and emotions. In the field of health, analyzing this type of data can have numerous benefits, particularly in psychology, where it can aid in automatically identifying a patient's mental state. In this study, we present a method for categorizing emotions from text data in the health field. We evaluated various classifiers, namely Convolutional Neural Networks (CNN), Gated Recurrent Units (GRU), and Zero-Shot Learning (ZSL), on the imbalanced EmoHD dataset. Our findings showed that the CNN model gives the best performance [1].</p>
      </abstract>
      <kwd-group>
        <kwd>Sentiment analysis</kwd>
        <kwd>CNN</kwd>
        <kwd>GRU</kwd>
        <kwd>Zero-Shot learning</kwd>
        <kwd>classification</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>Souaad HAMZA-CHERIF1, Nesma SETTOUTI1,2</p>
    </sec>
    <sec id="sec-2">
      <title>1. Introduction</title>
      <p>Sentiments are complex phenomena that can take many forms, such as emotions, reactions to
events, or mental states. They can be positive or negative: joyful, bored, sad, and so on.</p>
      <p>Sentiment analysis has become a rapidly growing field, particularly with the rise of social
media and the semantic web, which make it easier for people to express their thoughts and
emotions through text, emoticons, and other means.</p>
      <p>Machine understanding and analysis of sentiments is particularly important in the field of
health, where a patient’s emotional state can greatly impact their healing process. For example, a
patient suffering from depression may not be in a favorable mental state to begin their recovery.
Similarly, in the field of psychology, automatic sentiment analysis can be very beneficial in
identifying a patient’s psychological state as well as the different human personality traits that
can explain human reasoning and behavior.</p>
      <p>In this context, machine learning techniques and natural language processing (NLP) offer
promising tools that have proven able to provide automatic solutions in various fields, including
health, because they play an important role in understanding the meaning of content so that a
machine can classify, analyze, and predict human sentiments. In this article, we present
an approach for sentiment analysis of text data in the field of health. We compare various
machine learning methods, including convolutional neural networks (CNN), gated recurrent
units (GRU), and zero-shot learning (ZSL).</p>
      <p>The rest of the article is structured as follows. In section 2, we discuss related work in the
field, then we present our proposed approach and research strategy in section 3. In section
4, we analyze the results we obtained and conclude with perspectives and future work.</p>
    </sec>
    <sec id="sec-3">
      <title>2. Related works</title>
      <p>
        Several recent studies have focused on using machine learning for sentiment analysis; one
notable example is in [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], where the authors used a lexicon-based classification approach to
predict the results of the 2016 U.S. Presidential Election between Hillary Clinton and Donald
Trump using Twitter data.
      </p>
      <p>
        In [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], the authors analyzed sentiments towards AstraZeneca, Pfizer and Moderna COVID-19
vaccines using Twitter data and the AFINN lexicon. The results showed positive sentiment
towards Pfizer and Moderna vaccines. In [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] the authors used the Word2vec model to compute
vector representations of words and then applied the Convolutional Neural Network CNN
model to analyze sentiment from a corpus of movie review excerpts that includes five labels
(negative, rather negative, neutral, rather positive, and positive) and achieved a test accuracy of
45.4%.
      </p>
      <p>
        Other studies have also applied sentiment analysis to health-related textual data. In [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], the
authors used term frequency-inverse document frequency (TF-IDF) vectorization and compared
the performance of different classifiers for emotion classification on the EmoHD dataset. In [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ],
the authors used the Intelligent Water Drop algorithm to select features of interest from the
EmoHD dataset and classify emotions in the health domain.
      </p>
      <p>
        In [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], the authors used deep learning models for Arabic aspect-based sentiment analysis. To
take advantage of both word and character representations, and extract the main opinion aspect,
they combined a bidirectional GRU, a Convolutional Neural Network (CNN), and a Conditional
Random Field (CRF), and then used an interactive attention network based on a bidirectional
GRU (IAN-BGRU) to identify the sentiment polarity towards the extracted aspects.
      </p>
      <p>
        In [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], the authors trained a unified model that performs zero-shot aspect-based sentiment
analysis (ABSA) without using any annotated data for a new domain using Zero-Shot Learning.
They evaluated it for end-to-end aspect-based sentiment analysis (E2E ABSA), which shows
that ABSA can be conducted without any human-annotated ABSA data.
      </p>
      <p>
        In [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] the authors also used a modified GRU model for the classification and understanding
of sentiments in tweets from an online site, compared it to the long short-term memory
(LSTM) and bidirectional long short-term memory (BiLSTM) models, and demonstrated
that the modified GRU outperformed them. In [10], the authors propose a two-stage
emotion detection methodology. In the first stage, they use a Zero-Shot Learning model based
on a sentence transformer, returning the probabilities for subsets of 34 emotions. Then, they
used the output of the zero-shot model as an input for the second stage, which trains a
machine learning classifier on the sentiment labels in a supervised manner using ensemble learning.
      </p>
      <p>In this work, we compare different approaches for classifying sentiments from health-related
textual data (EmoHD) using CNN, GRU, and ZSL models. Our primary goal is to explore other
deep learning models on the EmoHD dataset and evaluate their performance, especially since
these models have demonstrated their effectiveness in the field. We then aim to evaluate the
annotation of the EmoHD dataset using the Zero-Shot Learning model.</p>
    </sec>
    <sec id="sec-4">
      <title>3. Proposed approach</title>
      <p>
        In this article, we present a method for classifying sentiments from text data in the EmoHD
[
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] health dataset. The EmoHD dataset is composed of 4,202 text samples from eight disease
classes and six emotion classes, collected from various online sources. Our approach, as shown
in figure 1, involves several steps: first, we pre-process the data to improve the classification.
As the EmoHD dataset is unbalanced, we then perform resampling to balance the data. Finally,
we classify emotions by experimenting with three methods: CNN, GRU, and ZSL.
      </p>
      <sec id="sec-4-1">
        <title>3.1. Data preprocessing</title>
        <p>To classify the text data in the EmoHD database, it is necessary to perform a series of
preprocessing steps to remove any noise present in the data and improve the classification accuracy.
The main steps are:</p>
        <p>• Lowercase conversion: converting all terms to lowercase is necessary to avoid duplicates; even though
words such as "Health" and "health" have the same meaning, they are treated as separate
lexical units if their cases differ.</p>
        <p>• Removal of punctuation: removing all punctuation marks that do not provide any useful information
helps to improve the performance of data classification.</p>
        <p>• Removal of stop words: these are words that are very common in the language being
studied but do not provide any informative value for understanding the "meaning" of a
document or corpus, so they are removed.</p>
        <p>• Removal of rare and common words: to avoid the negative impact that rare and overly common
words can have on the classification, we remove them by counting their frequency of appearance
in the text, which helps reduce the noise they generate.</p>
        <p>• Lemmatization: each word is replaced with its canonical form; for example, the word "known" is
replaced with its canonical form "know". This step is useful for the thematic classification
of texts because it treats different variants derived from the same form or root as a single
word.</p>
        <p>• Tokenization: the act of parsing text into tokens; in other words, the text is segmented into
linguistic units such as words, punctuation marks, numbers, and alphanumeric data, each element
corresponding to a token that is used in the analysis.</p>
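        <p>As an illustration, the preprocessing steps above can be sketched in plain Python. This is a minimal sketch using only the standard library: the stop-word list is a small illustrative assumption, and lemmatization is omitted (in practice a library such as NLTK would supply both).</p>

```python
import re
from collections import Counter

# Small illustrative stop-word list (an assumption; a real pipeline
# would use a full list, e.g. from NLTK).
STOP_WORDS = {"the", "a", "an", "in", "of", "and", "is", "are", "to", "for", "very"}

def preprocess(texts, rare_threshold=2):
    """Lowercase, strip punctuation, tokenize, remove stop words,
    then drop rare words by counting their frequency of appearance."""
    tokenized = []
    for text in texts:
        text = text.lower()                   # lowercase conversion
        text = re.sub(r"[^\w\s]", " ", text)  # removal of punctuation
        tokens = text.split()                 # whitespace tokenization
        tokens = [t for t in tokens if t not in STOP_WORDS]  # stop-word removal
        tokenized.append(tokens)
    # removal of rare words: count frequencies over the whole corpus
    freq = Counter(t for doc in tokenized for t in doc)
    return [[t for t in doc if freq[t] >= rare_threshold] for doc in tokenized]

docs = ["The patient is happy!", "The patient is afraid, very afraid."]
print(preprocess(docs))
```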
      </sec>
      <sec id="sec-4-2">
        <title>3.2. Data re-sampling</title>
        <p>The EmoHD dataset has six class labels: Angry (1343), Excited (1215), Fear (742), Happy (522),
Sad (358), and Bored (22). As the data is unbalanced, it is necessary to perform resampling to
balance the data. Our approach uses naive oversampling: observations from the minority
classes are randomly duplicated in order to increase their representation. This is done by
re-sampling with replacement.</p>
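        <p>The naive oversampling step can be sketched as follows (standard-library Python; the class counts in the usage example are illustrative, not the EmoHD counts):</p>

```python
import random
from collections import Counter

def oversample(samples, labels, seed=0):
    """Naive random oversampling: duplicate observations from the minority
    classes (re-sampling with replacement) until every class matches the
    majority-class count."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(v) for v in by_class.values())
    out_x, out_y = [], []
    for y, items in by_class.items():
        # draw with replacement until this class reaches the majority count
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for x in items + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y

labels = ["Angry"] * 7 + ["Bored"] * 3          # illustrative imbalanced counts
texts = [f"sample {i}" for i in range(10)]
bal_x, bal_y = oversample(texts, labels)
print(Counter(bal_y))                           # both classes now have 7 observations
```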
      </sec>
      <sec id="sec-4-3">
        <title>3.3. Classification models</title>
        <p>• Convolutional Neural Network (CNN)</p>
        <p>
          Convolutional Neural Networks (CNNs) are deep feedforward artificial neural
networks composed of layers of nodes, including an input layer, one or more hidden
layers, and an output layer [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ]. Each node connects to other nodes and has an associated
weight and threshold. If the output of an individual node exceeds the designated threshold
value, that node is activated and data is passed to the next layer of the network; otherwise,
no data is passed on. In our approach, we implemented a CNN with 5 layers:
– The Embedding Layer: the input layer, which maps words in the text to
real-valued vectors. It takes three inputs: the size of the vocabulary (input_dim,
set to max_feature), the dimension of each embedded word, and the maximum
length of a sequence (input_length).
– Spatial Dropout Layer: reduces overfitting during training by randomly
deactivating neurons at each epoch, so that during each forward pass the model
learns with a different configuration of active neurons.
– 1D Convolution Layer (temporal convolution): creates a convolution
kernel that is convolved with the input over a single spatial (or temporal) dimension
to produce a tensor of outputs.
– MaxPooling1D Layer: down-samples the input representation by taking
the maximum value over a spatial window of size pool_size.
– The Dense Layer: the output layer, which applies the activation
function; in our case, we used the softmax function, which is commonly used for
multi-class problems such as ours.
        </p>
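        <p>Under the description above, the layer stack can be sketched in Keras. The hyper-parameter values (vocabulary size, embedding dimension, filter count, kernel size) are illustrative assumptions, and global max pooling stands in for the MaxPooling1D step so the sketch stays independent of the sequence length:</p>

```python
from tensorflow.keras import layers, models

max_feature = 20000   # vocabulary size (illustrative assumption)
num_classes = 6       # six emotion classes in EmoHD

model = models.Sequential([
    layers.Embedding(max_feature, 128),               # embedding layer
    layers.SpatialDropout1D(0.2),                     # spatial dropout against overfitting
    layers.Conv1D(64, 5, activation="relu"),          # 1D (temporal) convolution
    layers.GlobalMaxPooling1D(),                      # max pooling over the time dimension
    layers.Dense(num_classes, activation="softmax"),  # softmax output layer
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
```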
        <p>Next, we compiled our model. Compiling the model takes three parameters: optimizer,
loss, and metrics. The optimizer controls the learning rate. We used ‘adam’ as our
optimizer; Adam is generally a good optimizer for many cases and adjusts the learning
rate throughout training. The learning rate determines how fast the optimal weights of
the model are computed. We used ‘categorical_crossentropy’ as our loss function, the most
common choice for classification; a lower score indicates that the model is performing
better. Finally, we used the ‘accuracy’ metric to monitor the accuracy score on the
validation set while training the model.</p>
        <p>• Gated Recurrent Unit (GRU)</p>
        <p>GRU (Gated Recurrent Unit) is an improved version of the standard recurrent neural
network, introduced in [11]. It has some advantages over long short-term memory
(LSTM) networks in certain cases, such as using less memory and being faster.</p>
        <p>GRUs solve the vanishing gradient problem of a standard RNN by using two gates, the
update gate and the reset gate. These gates are two vectors that decide what information
is allowed to pass to the output and can be trained to retain information from far back in
time. This allows relevant information to be passed along a chain of events for better
predictions.</p>
        <p>The GRU model implemented in our approach has 3 layers: an embedding layer (defined
in the previous section); a GRU layer, which is a fully connected layer with 64 gated
recurrent units instead of simple neurons (a GRU is similar to a long short-term
memory (LSTM) unit but with fewer parameters); and a dense layer with a softmax
activation function.</p>
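        <p>The corresponding three-layer GRU model can be sketched the same way (again with illustrative hyper-parameter values):</p>

```python
from tensorflow.keras import layers, models

max_feature = 20000   # vocabulary size (illustrative assumption)

gru_model = models.Sequential([
    layers.Embedding(max_feature, 128),      # embedding layer
    layers.GRU(64),                          # 64 gated recurrent units
    layers.Dense(6, activation="softmax"),   # dense output layer, softmax activation
])
gru_model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
```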
        <p>Next, we compiled our model with the three parameters previously described for the
CNN model: optimizer, loss, and metric.</p>
        <p>• Zero-Shot Learning (ZSL)</p>
        <p>Zero-shot learning (ZSL) is the ability to perform a task without any prior training
examples. It is used to build models for classes that have not yet been labeled. ZSL
transfers knowledge from known classes to new classes using class attributes as the basis
of this transfer. The process of ZSL has two phases:
– Training: gaining knowledge about the attributes.
– Inference: using that knowledge to classify examples into new classes.
In our approach, we implement zero-shot classification on the EmoHD dataset using the
transformers library. Our proposed model takes the candidate labels ("Angry", "Bored",
"Happy", "Sad", "Excited", "Fear") from the EmoHD dataset and the text vectorized by
the sentence transformer. The output of the zero-shot model is a list of emotion labels
mapped to their probabilities for the given input text.</p>
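        <p>A minimal sketch of this zero-shot step, assuming the Hugging Face transformers library and its zero-shot-classification pipeline (the model name below is an illustrative assumption, not necessarily the exact model used in this work):</p>

```python
EMOHD_LABELS = ["Angry", "Bored", "Happy", "Sad", "Excited", "Fear"]

def classify_emotion(text, labels=EMOHD_LABELS):
    """Return the candidate labels mapped to their zero-shot probabilities."""
    # Deferred import: requires the `transformers` package and a model download.
    from transformers import pipeline
    clf = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")
    out = clf(text, candidate_labels=list(labels))
    # The pipeline returns labels sorted by decreasing score.
    return dict(zip(out["labels"], out["scores"]))
```

        <p>Calling classify_emotion on a text sample yields a label-to-probability mapping of the kind shown in Table 2.</p>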
      </sec>
    </sec>
    <sec id="sec-5">
      <title>4. Experiments and results</title>
      <p>In this research, we evaluated the performance of GRU, CNN, and Zero-Shot Learning models on
the EmoHD dataset. Additionally, we analyzed the labels of the EmoHD dataset using Zero-Shot
Learning. For our testing procedure, we split the data into a training set (80%) and a validation
set (20%).</p>
      <p>To evaluate the performance of each model, we used the F1-score, a commonly used
evaluation metric that is more appropriate than accuracy for datasets with imbalanced classes.
The F1-score is a combination of precision and recall, and takes into account both false positives
(FP) and false negatives (FN). The formula for the F1-score is given in equation 1.</p>
      <p>F1-score = (2 × TP) / (2 × TP + FP + FN) (1)</p>
      <p>We trained the machine learning algorithms described in the "Classification models"
section and evaluated their performance on the EmoHD dataset over 12 epochs. In our study, we
evaluated the classification models with and without re-sampling the data. The results of these
evaluations are presented in Table 1.</p>
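      <p>As a concrete check of the metric, the F1-score can be computed directly from the confusion counts; the counts below are illustrative, not results from this study:</p>

```python
def f1_score(tp, fp, fn):
    """F1-score from true positives, false positives and false negatives:
    F1 = 2*TP / (2*TP + FP + FN)."""
    return 2 * tp / (2 * tp + fp + fn)

# Illustrative counts: 8 true positives, 2 false positives, 2 false negatives.
print(f1_score(8, 2, 2))   # 0.8
```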
      <p>From the results in Table 1, it can be seen that the CNN model achieved the highest score of
82% while the GRU model had a score of 80% after re-sampling the data. Thus, we concluded
that re-sampling improves the classification performance.</p>
      <p>We also noticed poor performance of the Zero-shot Learning model. To understand these
results, we randomly selected 50 text samples from the EmoHD dataset and classified them
according to the output class labels (Angry, Excited, Happy, Fear, Sad, Bored) to compare the
results of the ZSL model with the existing labels in EmoHD. An example of this can be seen in
Table 2.</p>
      <p>From the 50 samples, only 12 predictions matched the labels of EmoHD. We also observed
that the confusion centered on the "Excited" label, which was frequently confused with
other labels.</p>
    </sec>
    <sec id="sec-6">
      <title>5. Conclusion and perspectives</title>
      <p>In this study, we proposed a method for sentiment analysis using unbalanced textual health
data. Our research focused on examining the effect of unbalanced data on text classification
and found that re-sampling improves classification performance. Additionally, we found that
the CNN model performed better than the other models in terms of F1-score. Furthermore, we
discovered that the similarities in semantics among the labels used in the EmoHD dataset led
to confusion, particularly with the term "Excited", which can be mistaken for both "Angry" and
"Happy". As next steps, we plan to continue our experiments in sentiment analysis and explore
the possibility of creating new textual databases from web data.</p>
      <p>[Table 2, example row] Text sample: "karachi dengue fever claim another life frontier
post may fp report karachi woman died dengue fever private hospital karachi total number death
fatal fever karachi reached two according detail deceased woman identified balqees bin qasim
locality karachi toll infected patient city reached since beginning year according dengue
surveillance cell total number dengue case far reported karachi month may." EmoHD label: Fear.
ZSL scores: 0.278 (Sad), 0.270 (Fear), 0.188 (Excited), 0.096 (Angry), 0.095 (Bored), 0.070 (Happy).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>[1] Z. Zhang, V. Saligrama, Zero-shot learning via semantic similarity embedding, CoRR abs/1509.04767 (2015). URL: http://arxiv.org/abs/1509.04767. arXiv:1509.04767.</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>[2] S. Srinivasan, R. Sangwan, C. Neill, T. Zu, Twitter data for predicting election results: Insights from emotion classification, IEEE Technology and Society Magazine 38 (2019) 58-63. doi:10.1109/MTS.2019.2894472.</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>[3] R. Marcec, R. Likic, Using Twitter for sentiment analysis towards AstraZeneca/Oxford, Pfizer/BioNTech and Moderna COVID-19 vaccines, Postgraduate Medical Journal 98 (2022) 544-550. URL: https://pmj.bmj.com/content/98/1161/544. doi:10.1136/postgradmedj-2021-140685.</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>[4] Sentiment analysis using a convolutional neural network, (2015) 2359-2364. doi:10.1109/CIT/IUCC/DASC/PICOM.2015.349.</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>[5] N. Azam, T. Ahmad, N. U. Haq, Automatic emotion recognition in healthcare data using supervised machine learning, PeerJ Computer Science 7 (2021) e751.</mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>[6] G. B. Mohammad, S. Potluri, A. Kumar, R. Tiwari, R. Shrivastava, S. Kumar, K. Srihari, K. Dekeba, et al., An artificial intelligence-based reactive health care system for emotion detections, Computational Intelligence and Neuroscience 2022 (2022).</mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>[7] M. Mustafa, T. H. A. Soliman, A. I. Taloba, M. F. Seedik, Arabic aspect based sentiment analysis using bidirectional GRU based models, CoRR abs/2101.10539 (2021). URL: https://arxiv.org/abs/2101.10539. arXiv:2101.10539.</mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>[8] L. Shu, H. Xu, B. Liu, J. Chen, Zero-shot aspect-based sentiment analysis, CoRR abs/2202.01924 (2022). URL: https://arxiv.org/abs/2202.01924. arXiv:2202.01924.</mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>[9] Sentiment analysis using modified GRU, in: Proceedings of the 2022 Fourteenth International Conference on Contemporary Computing, pp. 356-361. URL: https://doi.org/10.1145/3549206.3549270. doi:10.1145/3549206.3549270.</mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>[10] S. G. Tesfagergish, J. Kapociute-Dzikiene, R. Damasevicius, Zero-shot emotion detection for semi-supervised sentiment analysis using sentence transformers and ensemble learning, Applied Sciences 12 (2022). URL: https://www.mdpi.com/2076-3417/12/17/8662. doi:10.3390/app12178662.</mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>[11] J. Chung, Ç. Gülçehre, K. Cho, Y. Bengio, Empirical evaluation of gated recurrent neural networks on sequence modeling, CoRR abs/1412.3555 (2014). URL: http://arxiv.org/abs/1412.3555. arXiv:1412.3555.</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>