<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>November</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>A comparative study of deep learning models for sentiment analysis of social media texts</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Vasily D. Derbentsev</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Vitalii S. Bezkorovainyi</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Andriy V. Matviychuk</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Oksana M. Pomazun</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Andrii V. Hrabariev</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alexey M. Hostryk</string-name>
          <email>alexeyGostrik@gmail.com</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Kryvyi Rih State Pedagogical University</institution>
          ,
          <addr-line>54 Gagarin Ave., Kryvyi Rih, 50086</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Kyiv National Economic University named after Vadym Hetman</institution>
          ,
          <addr-line>54/1 Peremogy Ave., Kyiv, 03680</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Odessa National Economic University</institution>
          ,
          <addr-line>8 Preobrazhenskaya Str., Odessa, 65082</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2023</year>
      </pub-date>
      <volume>1</volume>
      <fpage>7</fpage>
      <lpage>18</lpage>
      <abstract>
        <p>Sentiment analysis is a challenging task in natural language processing, especially for social media texts, which are often informal, short, and noisy. In this paper, we present a comparative study of deep learning models for sentiment analysis of social media texts. We develop three models based on deep neural networks (DNNs): a convolutional neural network (CNN), a CNN with long short-term memory (LSTM) layers (CNN-LSTM), and a bidirectional LSTM with CNN layers (BiLSTM-CNN). We use GloVe and Word2vec word embeddings as vector representations of words. We evaluate the performance of the models on two datasets: IMDb Movie Reviews and Twitter Sentiment 140. We also compare the results with a logistic regression classifier as a baseline. The experimental results show that the CNN model achieves the best accuracy of 90.1% on the IMDb dataset, while the BiLSTM-CNN model achieves the best accuracy of 82.1% on the Sentiment 140 dataset. The proposed models are comparable to state-of-the-art models and suitable for practical use in sentiment analysis of social media texts.</p>
      </abstract>
      <kwd-group>
        <kwd>sentiment analysis</kwd>
        <kwd>social media</kwd>
        <kwd>deep learning</kwd>
        <kwd>convolutional neural networks</kwd>
        <kwd>long short-term memory</kwd>
        <kwd>word embeddings</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Natural Language Processing (NLP) resides at the crossroads of Computer Science, Artificial
Intelligence, and Linguistics, and is dedicated to the computer-based analysis of human language.
The swift evolution of electronic mass media and social networks has spurred the advancement
of automated NLP systems. Their applications include
translation, speech recognition, named entity recognition, text classification and summarization,
sentiment analysis, question answering, autocomplete, predictive text input, and more [
        <xref ref-type="bibr" rid="ref2 ref3 ref4">2, 3, 4</xref>
        ].
      </p>
      <p>Central to NLP is Sentiment Analysis (SA), also known as opinion mining. SA endeavors to
distill subjective attributes from text, such as emotions, sarcasm, confusion, and suspicion.</p>
      <p>The crux of SA revolves around classifying the polarity of a given document, determining
whether the sentiment expressed is positive, negative, or neutral.</p>
      <p>Being a potent text classification technique, sentiment analysis can unveil a wealth of insights
about viewpoints on discussed subjects. It facilitates comprehensive analysis of feedback,
message polarity, and reactions. Notably, SA finds extensive utility among business professionals,
marketers, and politicians.</p>
      <p>
        In dissecting public sentiment regarding sensitive social and political matters, discerning
prevailing themes and tonalities within discussions significantly eases the tasks of sociologists,
political scientists, and journalists [
        <xref ref-type="bibr" rid="ref5 ref6">5, 6</xref>
        ].
      </p>
      <p>In the face of ever-mounting information volumes, conventional methodologies have begun
to falter. Swiftly monitoring and controlling public sentiment remains pivotal for success.</p>
      <p>
        Historically, this challenge has been met with dictionary or rule-based approaches [
        <xref ref-type="bibr" rid="ref10 ref7 ref8 ref9">7, 8, 9, 10</xref>
        ].
These methods are statistical, relying on precompiled sentiment lexicons that pair words with
respective polarities to categorize them as “positive” or “negative”.
      </p>
      <p>However, constructing complete dictionaries for the large amount of unstructured data
generated by modern electronic media and social networks is quite a tedious task.</p>
      <p>
        Machine Learning (ML) methods [
        <xref ref-type="bibr" rid="ref11 ref12 ref13">11, 12, 13</xref>
        ] help solve this problem. Such approaches are
based on algorithms that classify words according to the corresponding sentiment labels.
ML models are therefore preferred for SA because, compared to dictionary-based approaches,
they can process large amounts of text.
      </p>
      <p>
        Over the past decade, Deep Neural Networks (DNNs) have emerged as formidable tools in
solving numerous NLP challenges, including SA [
        <xref ref-type="bibr" rid="ref14 ref15 ref16">14, 15, 16</xref>
        ]. This surge is underpinned by:
• Progress in crafting diverse DNN architectures (recurrent, convolutional, encoder-decoder,
transformer, hybrid).
• Escalating computational prowess, bolstered by graphics processing units and a profusion
of cloud computing services.
• Availability of labeled datasets tailored to various NLP tasks.
• Emergence of pre-trained word vector representations (word embeddings) such as Word2Vec and
FastText [
        <xref ref-type="bibr" rid="ref17 ref18 ref19">17, 18, 19</xref>
        ], extending across multiple languages.
      </p>
      <p>
        Recent years have seen the ascendancy of colossal pre-trained models rooted in the
Transformer architecture and the Attention mechanism—think GPT-3, BERT, ELMo [
        <xref ref-type="bibr" rid="ref20 ref21 ref22 ref23">20, 21, 22, 23</xref>
        ].
These models embody language models, encapsulating probability distributions across word
sequences.
      </p>
      <p>These models are all-encompassing, extracting features from text pivotal for solving diverse
text analysis conundrums. However, they come at a computational cost—bearing hundreds of
millions of parameters, necessitating formidable computational resources.</p>
      <p>Hence, for the majority of practical NLP applications, conventional ML and Deep Learning
(DL) methodologies persist as stalwarts.</p>
      <p>Our research aims to architect a suite of sentiment classification models grounded in varied
DNN architectures, scrutinizing their efficacy across the IMDb and Sentiment 140 Twitter
datasets.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related works</title>
      <p>
        Drus and Khalid [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ] reviewed sentiment analysis in social media and
explored the methods and approaches commonly used in this domain. The review
analyzes about 30 publications from 2014 to 2019. According to their results,
most of the articles applied the opinion-lexicon method to analyze text sentiment in social media
in domains such as world events, healthcare, politics, and business.
      </p>
      <p>
        Recently Jain et al. [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ] published a report on ML applications for consumer sentiment analysis
in the domain of hospitality and tourism. The report is based on 68 research papers
focused on sentiment classification, predictive recommendation decisions, and fake review
detection.
      </p>
      <p>They conducted a systematic literature review to compare, analyze, explore, and understand
ML possibilities, and to identify research gaps and future research directions.</p>
      <p>
        Sudhir and Suresh [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ] published a comparative study of various approaches, applications, and
classifiers for sentiment analysis. They discussed the advantages and disadvantages of
the different approaches used for SA (rule-based, ML, and DL) and
compared the performance of the classification models on the IMDb dataset.
      </p>
      <p>The authors note that, in general, ML-based approaches provide greater accuracy than
rule-based ones. Conventional ML models (Support Vector Machine, Decision
Trees, and Logistic Regression) provide classification accuracy at the level of 85-87% on the
IMDb dataset, while DL-based models (CNN, LSTM, GRU) show higher accuracy: about 89% on the
same dataset.</p>
      <p>
        Trisna and Jie [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] presented a comparative review of DL approaches for Aspect-Based SA.
The results of their analysis show that the use of pre-trained embeddings strongly influences
the level of accuracy. They also found that the best-performing method differs from dataset to
dataset, so finding a method that is flexible and effective across
several datasets remains challenging.
      </p>
      <p>There are several papers devoted to developing new methods of word embeddings.</p>
      <p>
        Thus, Biesialska et al. [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ] proposed a novel method that uses contextual embeddings and
a self-attention mechanism to detect and classify sentiment. They performed experiments on
reviews from different domains and in different languages (Polish and German).
      </p>
      <p>The authors showed that the proposed approach is on a par with state-of-the-art models or even
outperforms them in several cases.</p>
      <p>
        Rasool et al. [
        <xref ref-type="bibr" rid="ref28">28</xref>
        ] proposed a novel word-to-word graph
(W2WG) embedding method for word representation in real-time sentiment classification. They noted that
the proposed word embedding approach, integrated with an LSTM-CNN model,
outperformed the other techniques and recently published studies on real-time sentiment
classification.
      </p>
      <p>
        Several recent research papers are devoted to applying different DNN
architectures based on CNN-LSTM models to the SA task [
        <xref ref-type="bibr" rid="ref29 ref30 ref31 ref32 ref33">29, 30, 31, 32, 33</xref>
        ].
      </p>
      <p>
        Elzayady et al. [
        <xref ref-type="bibr" rid="ref29">29</xref>
        ] presented two powerful hybrid DL models (CNN-LSTM and
CNN-BiLSTM) for review classification. Experimental results showed that the two proposed
models outperformed the baseline DL models (CNN, LSTM).
      </p>
      <p>
        Khan et al. [
        <xref ref-type="bibr" rid="ref31">31</xref>
        ] evaluated the performance of various word embeddings for Roman Urdu and
English dialects using the CNN-LSTM architecture and compared the results with traditional ML
classifiers. The authors reported that BERT word embeddings, a two-layer LSTM, and SVM as the
classifier function are the more suitable options for English-language sentiment analysis.
      </p>
      <p>
        Priyadarshini and Cotton [
        <xref ref-type="bibr" rid="ref32">32</xref>
        ] proposed a novel LSTM-CNN grid search-based DNN model
for sentiment analysis. In their experiments, the proposed model performed
relatively better than other algorithms (LSTM, fully-connected NN, K-nearest neighbors, and
CNN-LSTM) on the Amazon reviews and IMDb datasets.
      </p>
      <p>
        Haque et al. [
        <xref ref-type="bibr" rid="ref33">33</xref>
        ] analyzed different DNNs for SA on IMDb Movie Reviews. They
compared CNN, LSTM, and LSTM-CNN architectures for sentiment classification in
order to find the best-suited architecture for this dataset. Experimental results showed that
the CNN achieved an F1-score of 91%, outperforming LSTM, LSTM-CNN, and other
state-of-the-art approaches for SA on the IMDb dataset.
      </p>
      <p>
        Quraishi [
        <xref ref-type="bibr" rid="ref34">34</xref>
        ] evaluated four ML algorithms (Multinomial Naïve Bayes, Support Vector
Machine, LSTM, and GRU) for sentiment analysis on the IMDb review dataset. Among
these four algorithms, GRU performed the best, with an accuracy of 89.0%.
      </p>
      <p>
        Derbentsev et al. [
        <xref ref-type="bibr" rid="ref35">35</xref>
        ] also explored the performance of four ML algorithms (Logistic
Regression, Support Vector Machine, fully-connected NN, and CNN) for SA on the IMDb dataset. They
used two pre-trained word embeddings, GloVe and Word2vec, with different dimensions (100
and 300), as well as the TF-IDF representation. They reported that the best classification accuracy
(90.1%) was achieved by the CNN model with the Word2vec-300 embedding.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Basic concepts of NLP applied to sentiment analysis</title>
      <sec id="sec-3-1">
        <title>3.1. ML approach to NLP</title>
        <p>To solve NLP problems using ML methods, it is necessary to represent the text as a
set of feature vectors. The text can consist of words, numbers, punctuation, and special characters or
additional markup (for example, HTML tags). Each such “unit” can be represented as a vector
in various ways, for example, using unitary codes (one-hot encoding), or context-independent
or context-dependent vector representations.</p>
        <p>
          The basic idea of applying ML to NLP was introduced by Bengio et al. [
          <xref ref-type="bibr" rid="ref36">36</xref>
          ]. They proposed to
jointly learn an “embedding” of words into an n-dimensional numeric vector space and to use
these vectors to predict how likely a word is given its context.
        </p>
        <p>In the case of text, features represent attributes and properties of documents including their
content and meta-attributes, such as document length, author name, source, and publication
date. Together, all document features describe a multidimensional feature space to which ML
methods can be applied.</p>
        <p>Thus, in the most general terms, the application of ML to SA problems consists of the following:
text data preprocessing, feature extraction, classification, and interpretation of results.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Data pre-processing</title>
        <p>
          The quality of the result depends on the input data, so it is important that the data are
prepared in the best possible way. In general, the pre-processing stage consists of the following
steps [
          <xref ref-type="bibr" rid="ref37 ref38 ref39">37, 38, 39</xref>
          ]:
• Text cleaning. First of all, we need to clean up the text. Depending on the task, cleaning
includes removing non-alphabetic characters, various tags, URLs, punctuation, extra spaces, and other
markup elements;
• Segmentation and tokenization. They are relevant in the vast majority of cases, and
divide the text into separate sentences and words (tokens). As a rule, after
tokenization all words are converted to lower case;
• Lemmatization and stemming. Typically, texts contain different grammatical forms of
the same word, and there may also be words with the same root. Lemmatization is the
process of reducing a word form to a lemma – its normal (dictionary) form. Stemming is
a crude heuristic process that cuts off the “excess” from the root of words, often resulting in
the loss of derivational suffixes. Lemmatization is a subtler process that uses vocabulary
and morphological analysis to eventually reduce a word to its canonical form, the lemma;
• Definition of context-independent features that characterize each token and do not
depend on adjacent elements;
• Refining significance and applying a stop-word filter. Stop words are frequently used
words that do not add additional information to the text. When we apply ML to texts,
such words can add a lot of noise, so it is necessary to get rid of them;
• Dependency parsing. The result is a tree structure in which each token is
assigned to one parent, and the type of relationship is established;
• Converting text content to a vector representation that highlights words used in similar or
identical contexts.
        </p>
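        <p>The cleaning, tokenization, and stop-word steps listed above can be sketched in plain Python. This is only an illustration under stated assumptions, not the pipeline used in this study: the regular expressions, the tiny stop-word set, and the function name preprocess are hypothetical, and lemmatization, stemming, and dependency parsing are left out because they need a linguistic library.</p>
        <preformat>
```python
import re

# "\x3c" and "\x3e" are the opening and closing angle brackets of a markup tag,
# written as escapes so that the snippet can be embedded in an XML document.
TAG_RE = re.compile(r"\x3c[^\x3e]+\x3e")
URL_RE = re.compile(r"https?://\S+")
NON_ALPHA_RE = re.compile(r"[^a-z\s]")

# Illustrative stop-word set (an assumption); real lists are much longer.
STOP_WORDS = frozenset({"a", "an", "the", "is", "it", "this", "to", "of", "and"})


def preprocess(text, stop_words=STOP_WORDS):
    """Clean a raw text and return its lower-cased tokens without stop words."""
    text = TAG_RE.sub(" ", text)        # remove markup tags
    text = URL_RE.sub(" ", text)        # remove URLs
    text = text.lower()                 # convert to lower case
    text = NON_ALPHA_RE.sub(" ", text)  # drop digits, punctuation, symbols
    tokens = text.split()               # whitespace tokenization
    return [token for token in tokens if token not in stop_words]


raw = "This movie is GREAT!\x3cbr /\x3e See https://example.com"
tokens = preprocess(raw)
print(tokens)
```
        </preformat>
        <p>Tags are removed before URLs so that a link placed inside a tag does not leave fragments behind; the order of the remaining steps is a design choice of this sketch.</p>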
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Features extraction</title>
        <p>ML algorithms cannot work directly with raw text, so it is necessary to convert the text into sets
of numbers (vectors) – construct a vector representation. In ML this process is called feature
extraction.</p>
        <p>Vector representation is a general name for various approaches to language modeling and
representation learning in NLP that map words (and possibly phrases) from some
dictionary to vectors.</p>
        <p>
          The most common approaches for constructing vector representations are Bag of Words,
TF-IDF, and Word Embeddings [
          <xref ref-type="bibr" rid="ref38">38</xref>
          ].
        </p>
        <sec id="sec-3-3-1">
          <title>3.3.1. Bag of words</title>
          <p>Bag of words (BoW) is a popular and simple feature extraction technique used in NLP. It describes
the occurrences of each word in the text.</p>
          <p>Essentially, it creates a matrix of occurrences for a sentence or document, ignoring grammar
and word order. These frequencies (“occurrences”) of words are then used as features for
learning.</p>
          <p>The basic idea of applying BoW is that similar documents have similar content. Therefore,
based on content, we can learn something about the meaning of the document.</p>
          <p>For all its simplicity and intuitive clarity, this approach has a significant drawback. The BoW
encoding uses a corpus (or set, collection) of words and represents any given text with a vector
of the length of the corpus. If a word in the corpus is present in the text, the corresponding
element of the vector is the frequency of the word in the text.</p>
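          <p>A minimal sketch of the BoW encoding described above, using only the Python standard library. The two toy documents and the helper names build_vocabulary and bow_vector are assumptions made for the illustration.</p>
          <preformat>
```python
from collections import Counter


def build_vocabulary(documents):
    """Sorted list of the distinct tokens seen in the tokenized documents."""
    return sorted({token for doc in documents for token in doc})


def bow_vector(doc, vocabulary):
    """Count vector of `doc` over `vocabulary`; word order is ignored."""
    counts = Counter(doc)
    return [counts[token] for token in vocabulary]


docs = [
    "the movie was good".split(),
    "the plot was bad and the acting was bad".split(),
]
vocab = build_vocabulary(docs)
vectors = [bow_vector(doc, vocab) for doc in docs]
print(vocab)
print(vectors)
```
          </preformat>
          <p>Each vector has one element per vocabulary entry, so its length grows with the vocabulary, which is the drawback noted above.</p>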
          <p>If individual words are encoded by one-hot vectors, then the feature space will have a
dimension equal to the cardinality of the collection’s dictionary, i.e. tens or even hundreds of
thousands. This dimension grows as the dictionary grows.</p>
        </sec>
        <sec>
          <title>3.3.2. N-grams</title>
          <p>Another, more complex way to create a dictionary is to use grouped words. This changes the size of the
dictionary and gives BoW more details about the document.</p>
          <p>This approach is called “N-gram”. An N-gram is a sequence of any entities (words, syllables,
letters, numbers, etc.). In the context of language corpora, an N-gram is usually understood as
a sequence of words.</p>
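          <p>Word N-grams can be extracted with a sliding window over the token list. The sketch below is illustrative; the function name ngrams and the sample sentence are assumptions.</p>
          <preformat>
```python
def ngrams(tokens, n):
    """All contiguous word sequences of length n, in order of appearance."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


tokens = "not a good movie".split()
pairs = ngrams(tokens, 2)
triples = ngrams(tokens, 3)
print(pairs)
print(triples)
```
          </preformat>
          <p>Only sequences that actually occur in the text are produced, and a text shorter than n yields an empty list.</p>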
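          <p>A unigram is one word, a bigram is a sequence of two words, a trigram is three words, and
so on. The number N indicates how many grouped words are included in the N-gram. Not all
possible N-grams get into the model, but only those that appear in the corpus.</p>
        </sec>
        <sec>
          <title>3.3.3. TF-IDF</title>
          <p>Term Frequency (TF) is the ratio of the number of occurrences of a certain word to the total number
of words in the document. Thus, the importance of a word <inline-formula><tex-math>$t$</tex-math></inline-formula> within a single document <inline-formula><tex-math>$d_i$</tex-math></inline-formula> is
evaluated:
            <disp-formula><label>(1)</label><tex-math>$$\mathrm{TF}(t, d_i) = \frac{n_t}{\sum_k n_k},$$</tex-math></disp-formula>
where <inline-formula><tex-math>$n_t$</tex-math></inline-formula> is the number of occurrences of the word <inline-formula><tex-math>$t$</tex-math></inline-formula> in the document <inline-formula><tex-math>$d_i$</tex-math></inline-formula>, and the denominator
of the fraction is the total number of words in the document.</p>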
          <p>But frequency scoring has a problem: words with the highest frequency have, accordingly,
the highest score. There may not be as much information gain for the model in these words as
there is in less frequent words.</p>
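          <p>One way to remedy the situation is to downgrade a word that appears frequently in all similar
documents. This metric is called TF-IDF (short for Term Frequency – Inverse Document
Frequency).</p>
          <p>In this metric, IDF is the inverse of the frequency with which a certain word occurs in the
documents of the collection:
            <disp-formula><label>(2)</label><tex-math>$$\mathrm{IDF}(t, d_i, D) = \log \frac{|D|}{|\{d_i \in D \mid t \in d_i\}|}.$$</tex-math></disp-formula>
          </p>
          <p>Here <inline-formula><tex-math>$|D|$</tex-math></inline-formula> is the number of documents in the collection (corpus), and <inline-formula><tex-math>$|\{d_i \in D \mid t \in d_i\}|$</tex-math></inline-formula> is the number
of documents in the collection <inline-formula><tex-math>$D$</tex-math></inline-formula> that contain the word <inline-formula><tex-math>$t$</tex-math></inline-formula>.</p>
          <p>There is only one IDF value for each unique word within a given collection of documents.
The IDF metric reduces the weight of words that are common across the corpus.</p>
          <p>TF-IDF is a statistical measure for estimating the importance of a word in a document that
is part of a collection or corpus:
            <disp-formula><label>(3)</label><tex-math>$$\text{TF-IDF}(t, d_i, D) = \mathrm{TF}(t, d_i) \times \mathrm{IDF}(t, d_i, D).$$</tex-math></disp-formula>
          </p>
          <p>The TF-IDF score increases in proportion to the frequency of occurrence of the word in the
document, but this is compensated by the number of documents containing this word.</p>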
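          <p>The TF, IDF, and TF-IDF definitions above translate directly into a few lines of Python. This is a sketch under stated assumptions: documents are already tokenized lists, the natural logarithm is used (the base is not fixed in the text), the queried word occurs in at least one document, and the three-document corpus is invented for the example.</p>
          <preformat>
```python
import math


def tf(term, doc):
    """Occurrences of `term` in `doc` divided by the number of words in `doc`."""
    return doc.count(term) / len(doc)


def idf(term, corpus):
    """Log of (number of documents / number of documents containing `term`)."""
    containing = sum(1 for doc in corpus if term in doc)
    return math.log(len(corpus) / containing)


def tf_idf(term, doc, corpus):
    """Product of the term frequency and the inverse document frequency."""
    return tf(term, doc) * idf(term, corpus)


corpus = [
    ["the", "good", "movie"],
    ["the", "bad", "movie"],
    ["the", "good", "good", "plot"],
]
print(tf("good", corpus[2]))
print(idf("plot", corpus))
print(tf_idf("the", corpus[0], corpus))
```
          </preformat>
          <p>A word that appears in every document, such as "the" here, gets an IDF of zero and therefore a zero TF-IDF weight regardless of how often it occurs.</p>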
          <p>The disadvantage of the frequency approach based on this metric is that it does not take into
account the context of a single word. Moreover, it does not distinguish the semantic similarity
of words. All vectors are equally far from each other in the feature space.</p>
        </sec>
        <sec id="sec-3-3-2">
          <title>3.3.4. Word embedding</title>
          <p>Word embedding is one of the most popular representations of a document’s vocabulary. It is
a technique that maps words to numeric vectors such that words with similar meanings
are close to each other in terms of some distance metric in
the vector space.</p>
          <p>
            Word embeddings underpin the impressive performance of DL methods on challenging NLP
problems. Recently, several powerful word embedding models have been developed:
• Word2vec (short for Words to Vectors, provided by Google in 2013) [
            <xref ref-type="bibr" rid="ref17">17</xref>
            ];
• GloVe (short for Global Vectors, provided by Stanford University in 2014) [
            <xref ref-type="bibr" rid="ref18">18</xref>
            ];
• FastText (provided by Facebook in 2017) [
            <xref ref-type="bibr" rid="ref19">19</xref>
            ];
• BERT (short for Bidirectional Encoder Representations from Transformers, provided by
Google in 2018) [
            <xref ref-type="bibr" rid="ref40">40</xref>
            ].
          </p>
          <p>These models are pre-trained on large text corpora, including Wikipedia and domain-specific
texts.</p>
          <p>Word2vec is a set of artificial neural network (ANN) models designed to obtain embeddings of natural language
words. It takes a large text corpus as input and maps each word to a vector, producing word
coordinates as output. It first generates a dictionary of the corpus and then calculates a vector
representation of the words by learning from the input texts.</p>
          <p>The vector representation is based on contextual proximity: words that occur in the text next
to the same words (and therefore have a similar meaning) will have close (by cosine distance)
vectors.</p>
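          <p>The notion of closeness by cosine distance can be illustrated with a short function. The three-dimensional vectors below are made-up toy values chosen for the example, not real Word2vec or GloVe embeddings (which typically have 100 to 300 dimensions).</p>
          <preformat>
```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical toy embeddings, for illustration only.
toy = {
    "good": [0.9, 0.1, 0.3],
    "great": [0.8, 0.2, 0.4],
    "terrible": [-0.7, 0.3, 0.1],
}
sim_close = cosine_similarity(toy["good"], toy["great"])
sim_far = cosine_similarity(toy["good"], toy["terrible"])
print(sim_close, sim_far)
```
          </preformat>
          <p>Cosine distance is one minus this similarity, so vectors pointing in nearly the same direction have a distance near zero.</p>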
          <p>Word2vec implements two main learning algorithms: CBoW (Continuous Bag of Words) and
Skip-gram (figure 1).</p>
          <p>CBoW is an architecture that predicts the current word based on its surrounding context.
The Skip-gram architecture does the opposite: it uses the current word to predict the surrounding
words.</p>
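          <p>The difference between the two architectures is easiest to see in the training examples each one consumes. The sketch below only builds those examples and trains no network; the window size and the helper names are assumptions made for the illustration.</p>
          <preformat>
```python
def context_windows(tokens, window=2):
    """Yield (current word, surrounding words) for every position in the text."""
    for i, center in enumerate(tokens):
        left = tokens[max(0, i - window):i]
        right = tokens[i + 1:i + 1 + window]
        yield center, left + right


def cbow_examples(tokens, window=2):
    """CBoW: the context words are the input, the current word is the target."""
    return [(context, center) for center, context in context_windows(tokens, window)]


def skipgram_examples(tokens, window=2):
    """Skip-gram: the current word is the input, each context word is a target."""
    return [(center, word)
            for center, context in context_windows(tokens, window)
            for word in context]


tokens = ["the", "movie", "was", "great"]
cbow = cbow_examples(tokens, window=1)
skipgram = skipgram_examples(tokens, window=1)
print(cbow)
print(skipgram)
```
          </preformat>
          <p>CBoW produces one example per position, with all context words grouped together, while Skip-gram produces one example per (current word, context word) pair.</p>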
          <p>Building a Word2vec model is possible using these two algorithms. The word order of the
context does not affect the result in any of these algorithms.</p>
          <p>GloVe focuses on word co-occurrences over the whole corpus. Its embeddings relate to the
probabilities that two words appear together. Thus, GloVe combines features of Word2vec and
singular value decomposition of the co-occurrence matrix.</p>
          <p>In the present study, we applied both Word2vec and GloVe models to obtain vector
representations of words.</p>
          <p>The main benefit of using pre-trained language models is that they provide high-quality
vector representations of words that take contextual dependencies into account and make it possible
to achieve better results on target tasks.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. DNNs classification models design</title>
      <p>After the previous stage, we can start building a classification model. The model type and
architecture depend on the research task of SA, which can be performed at different hierarchical levels
of text documents (document-level, sentence-level, word or aspect-level), in different domains (reviews
about travel agencies, hotels, movies, election opinion prediction, analysis of public opinion on
acute social and political issues), and as binary or multiclass classification.</p>
      <p>If we have a dataset of texts with class labels (for example, with binary labels “positive”
and “negative”), we could apply Supervised ML techniques, in particular, binary classification
algorithms.</p>
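      <p>Mathematically, this problem can be formulated as follows: given a training sample of texts
<inline-formula><tex-math>$X = \{x_1, x_2, \ldots, x_N\}$</tex-math></inline-formula>, for each text there is a class label <inline-formula><tex-math>$Y = \{y_i\}$</tex-math></inline-formula>, <inline-formula><tex-math>$y_i \in \{0, 1\}$</tex-math></inline-formula>, <inline-formula><tex-math>$i = 1, 2, \ldots, N$</tex-math></inline-formula>.
It is necessary to build a classifier model <inline-formula><tex-math>$f(x, w) : X \to Y$</tex-math></inline-formula>, where <inline-formula><tex-math>$w$</tex-math></inline-formula> is a vector of unknown
parameters or weights.</p>
      <p>At the same time, it is necessary to minimize the loss function <inline-formula><tex-math>$L$</tex-math></inline-formula> that determines the total
deviation of real class labels from those predicted by the classifier. For binary classification
problems, the most common is binary cross-entropy:
        <disp-formula><label>(4)</label><tex-math>$$L = -\frac{1}{N}\left[\sum_{i=1}^{N}\bigl(y_i \log(p_i) + (1 - y_i)\log(1 - p_i)\bigr)\right],$$</tex-math></disp-formula>
where <inline-formula><tex-math>$N$</tex-math></inline-formula> is the size of the training sample, <inline-formula><tex-math>$y_i \in \{0, 1\}$</tex-math></inline-formula> is the true class label for the <inline-formula><tex-math>$i$</tex-math></inline-formula>-th data
sample, and <inline-formula><tex-math>$p_i$</tex-math></inline-formula> is the probability of belonging to the positive class for the <inline-formula><tex-math>$i$</tex-math></inline-formula>-th data sample provided
by the classifier.</p>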
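      <p>As a quick numerical check of the binary cross-entropy loss, the following sketch evaluates it on hand-picked labels and probabilities. The clipping constant eps is an assumption added here to keep the logarithm finite when a predicted probability is exactly 0 or 1; it is not part of the definition above.</p>
      <preformat>
```python
import math


def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean binary cross-entropy of predicted positive-class probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip so that log() stays finite
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(y_true)


labels = [1, 0]
unsure = binary_cross_entropy(labels, [0.5, 0.5])
confident = binary_cross_entropy(labels, [0.9, 0.1])
wrong = binary_cross_entropy(labels, [0.1, 0.9])
print(unsure, confident, wrong)
```
      </preformat>
      <p>The loss is smallest for confident correct predictions and grows quickly for confident wrong ones, which is what makes it a useful training objective for binary sentiment labels.</p>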
      <sec id="sec-4-1">
        <title>4.1. Logistic regression</title>
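        <p>Since the task of SA in the general case is reduced to a binary classification problem (negative,
positive), we chose the Logistic Regression (LR) model as the baseline classifier <inline-formula><tex-math>$f(\cdot)$</tex-math></inline-formula>:
          <disp-formula><label>(5)</label><tex-math>$$f(x, w) = \sigma(\langle x, w \rangle),$$</tex-math></disp-formula>
where <inline-formula><tex-math>$\langle x, w \rangle$</tex-math></inline-formula> denotes the scalar product and <inline-formula><tex-math>$\sigma(\cdot)$</tex-math></inline-formula> is the sigmoid (logistic) function
          <disp-formula><label>(6)</label><tex-math>$$\sigma(z) = \frac{1}{1 + \exp(-z)}.$$</tex-math></disp-formula>
        </p>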
        <p>An advantage of LR is that it predicts the probability that a sample
(in our case, a tokenized and vectorized text) belongs to one of the two target classes.</p>
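        <p>The LR prediction step, a scalar product followed by the logistic function, can be sketched as follows. The feature and weight values are arbitrary illustrative numbers, and no training is shown. A bias term, if wanted, can be folded in by appending a constant 1.0 to every feature vector.</p>
        <preformat>
```python
import math


def sigmoid(z):
    """Logistic function; maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(x, w):
    """Probability of the positive class for feature vector x and weights w."""
    z = sum(xi * wi for xi, wi in zip(x, w))  # scalar product of x and w
    return sigmoid(z)


neutral = predict_proba([1.0, 2.0], [0.5, -0.25])
positive = predict_proba([1.0, 0.0], [2.0, 0.0])
print(neutral, positive)
```
        </preformat>
        <p>A probability above 0.5 would be read as the positive class. For very large negative inputs, math.exp can overflow, so a production implementation would use a numerically stable variant.</p>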
      </sec>
      <sec id="sec-4-2">
        <title>4.2. CNN model</title>
        <p>
          CNNs are a class of DNNs that were originally designed for image processing [
          <xref ref-type="bibr" rid="ref41">41</xref>
          ]. However, these
models have also proven efficient for many other tasks, such as time series forecasting [
          <xref ref-type="bibr" rid="ref42">42</xref>
          ].
        </p>
        <p>
          Kim [
          <xref ref-type="bibr" rid="ref43">43</xref>
          ] has shown that CNNs are efficient for classifying texts on different datasets. Recently,
they have also been used for various NLP tasks (speech generation and recognition, text
summarization, named entity extraction).
        </p>
        <p>The architecture of CNNs consists of convolutional and subsampling layers (figure 2).</p>
        <p>The convolutional layer performs feature extraction from the input data and generates feature
maps. The feature map is computed through an element-wise multiplication of the small matrix
of weights (kernel) and the matrix representation of the input data, and the result is summed.</p>
        <p>This weighted sum is then passed through a non-linear activation function. One of the most
common is the ReLU function, which is given as <inline-formula><tex-math>$\mathrm{ReLU}(z) = \max(0, z)$</tex-math></inline-formula>.</p>
        <p>The pooling (subsampling) layer is a non-linear compaction of the feature maps. For example,
max-pooling takes the largest element from the feature map, whereas sum-pooling takes the sum of all its
elements.</p>
        <p>After max-pooling, feature maps are concatenated into a flattened vector, which is then
passed to a fully connected layer.</p>
        <p>The input data for most NLP problems is text, which consists of sentences and words. We
therefore need to represent the text as an array of vectors of a certain length: each word is mapped to a
specific vector in a vector space composed of the entire vocabulary.</p>
        <p>As these vectors, we can use word frequencies (for example, obtained using the TF-IDF
metric), or pre-trained embeddings (Word2vec, GloVe, FastText).</p>
        <p>Unlike image processing, text convolution is performed using one-dimensional filters (1D
convolution) on one-dimensional input data, for example, sentences, using convolution kernels
of different sizes (widths).</p>
        <p>Applying multiple kernel widths and feature maps is analogous to the use of N-grams.</p>
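        <p>The chain of 1D convolution, ReLU, and max-pooling over a sequence of word vectors can be sketched in plain Python. This is a toy illustration, not the model trained in this study: the two-dimensional "embeddings", the kernel weights, and the function names are assumptions, bias terms are omitted, and every text is assumed to be at least as long as the widest kernel.</p>
        <preformat>
```python
def conv1d(sequence, kernel):
    """Slide a kernel of width k over a sequence of word vectors, then apply ReLU.

    `sequence` is a list of word vectors; `kernel` is a list of k weight vectors
    of the same dimension as the word vectors.
    """
    k = len(kernel)
    feature_map = []
    for start in range(len(sequence) - k + 1):
        window = sequence[start:start + k]
        # Element-wise multiplication of the kernel and the window, summed up.
        s = sum(w * x
                for kernel_row, vector in zip(kernel, window)
                for w, x in zip(kernel_row, vector))
        feature_map.append(max(0.0, s))  # ReLU activation
    return feature_map


def text_features(sequence, kernels):
    """Max-pool each feature map to one value and concatenate the results."""
    return [max(conv1d(sequence, kernel)) for kernel in kernels]


# Four words, each represented by a made-up 2-dimensional vector.
sentence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
kernel_w2 = [[1.0, 0.0], [0.0, 1.0]]                  # width 2, like a bigram
kernel_w3 = [[1.0, 1.0], [-1.0, -1.0], [1.0, 1.0]]    # width 3, like a trigram

map_w2 = conv1d(sentence, kernel_w2)
map_w3 = conv1d(sentence, kernel_w3)
features = text_features(sentence, [kernel_w2, kernel_w3])
print(map_w2, map_w3, features)
```
        </preformat>
        <p>Each kernel width looks at a different number of adjacent words, which is the sense in which multiple widths resemble N-grams; the pooled values form the vector passed to the fully connected layer.</p>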
        <p>For image processing, convolutions are usually performed on separate channels that
correspond to the colors of the image: red, green, blue. A set of different filters is applied to each
channel, and the results of this operation are then merged into a single vector.</p>
        <p>For text processing, we can consider as channels, for example, the sequence of words or
word embeddings. The outputs of the different kernels applied to the words can then be merged into a single
vector.</p>
        <p>The final result of sentiment analysis is obtained by applying a sigmoid activation function
(binary classification task) or softmax (in the case of a multi-class task).</p>
      </sec>
      <sec id="sec-4-3">
        <title>4.3. LSTM and BiLSTM model</title>
        <p>Sequential information and long-term dependencies in NLP are traditionally handled with
Recurrent Neural Networks (RNNs), which can compute context information, for example, in
dependency parsing.</p>
        <p>
          The most common and efficient for many ML tasks, including NLP, were architectures based
on LSTM (Long Short-Term Memory) or GRU (Gated Recurrent Unit) cells [
          <xref ref-type="bibr" rid="ref16 ref37">37, 16</xref>
          ].
        </p>
        <p><bold>4.3.1. LSTM</bold></p>
        <p>
          The LSTM model proposed by Hochreiter and Schmidhuber [
          <xref ref-type="bibr" rid="ref44">44</xref>
          ] introduces the concept of a state
for each of the layers of an RNN, which plays the role of memory.
        </p>
        <p>The input signal affects the state of the memory, and this, in turn, affects the output layer,
just like in an RNN. But this state of memory persists throughout the time steps of a sequence
(for example, a time series, sentence, or text document). Therefore, each input signal affects the
state of the memory as well as the output signal of the hidden layer.</p>
        <p>The LSTM cell includes several units or gates: the input, output, and forget gates (figure 3). These
gates are used to control a memory cell that carries the hidden state <inline-formula><tex-math>h_t</tex-math></inline-formula> to the next time step.</p>
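        <p>The LSTM cell is formally defined as:
          <disp-formula><label>(7)</label><tex-math>i_t = \sigma(W_i \cdot (h_{t-1}, x_t) + b_i),</tex-math></disp-formula>
          <disp-formula><label>(8)</label><tex-math>f_t = \sigma(W_f \cdot (h_{t-1}, x_t) + b_f),</tex-math></disp-formula>
          <disp-formula><label>(9)</label><tex-math>\tilde{c}_t = \tanh(W_c \cdot (h_{t-1}, x_t) + b_c),</tex-math></disp-formula>
          <disp-formula><label>(10)</label><tex-math>o_t = \sigma(W_o \cdot (h_{t-1}, x_t) + b_o),</tex-math></disp-formula>
          <disp-formula><label>(11)</label><tex-math>c_t = f_t \otimes c_{t-1} + i_t \otimes \tilde{c}_t,</tex-math></disp-formula>
          <disp-formula><label>(12)</label><tex-math>h_t = o_t \otimes \tanh(c_t),</tex-math></disp-formula>
where <inline-formula><tex-math>x_t</tex-math></inline-formula> is the input vector at time step <inline-formula><tex-math>t</tex-math></inline-formula>; <inline-formula><tex-math>c_{t-1}</tex-math></inline-formula> and <inline-formula><tex-math>h_{t-1}</tex-math></inline-formula> are the cell state (long-term content)
and the hidden state at the previous time step <inline-formula><tex-math>(t-1)</tex-math></inline-formula>, respectively; <inline-formula><tex-math>(h_{t-1}, x_t)</tex-math></inline-formula> is the concatenation of the previous hidden state and the current input; <inline-formula><tex-math>\sigma(\cdot)</tex-math></inline-formula> and <inline-formula><tex-math>\tanh(\cdot)</tex-math></inline-formula> are the sigmoid and
hyperbolic tangent activation functions; <inline-formula><tex-math>\otimes</tex-math></inline-formula> is the element-wise (Hadamard) product; <inline-formula><tex-math>W_i, W_f, W_o, W_c</tex-math></inline-formula> are the weight
matrices of the input, forget, and output gates and of the candidate layer, respectively; <inline-formula><tex-math>b_i, b_f, b_o, b_c</tex-math></inline-formula> are the corresponding biases.</p>
        <p>The input gate <inline-formula><tex-math>i_t</tex-math></inline-formula> determines which values need to be updated. Then the hyperbolic tangent layer
builds a vector <inline-formula><tex-math>\tilde{c}_t</tex-math></inline-formula> of new values that can be added to the state of the cell <inline-formula><tex-math>c_t</tex-math></inline-formula>.
The forget gate <inline-formula><tex-math>f_t</tex-math></inline-formula> controls how much is remembered (what part of the information is kept
and what is erased) from step to step. The decision about what information can be discarded from the cell
state is made by a sigmoid layer.</p>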
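        <p>The output gate <inline-formula><tex-math>o_t</tex-math></inline-formula> receives an input signal (which is the concatenation of the input signal at
time step <inline-formula><tex-math>t</tex-math></inline-formula> and the cell output signal at time step <inline-formula><tex-math>(t-1)</tex-math></inline-formula>) and passes it to the output. Thus, this
gate determines which part of the long-term content <inline-formula><tex-math>c_t</tex-math></inline-formula> should be transferred to the next time
step.</p>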
        <p>Each of these gates is a feed-forward neural network layer consisting of a set of
trainable weights followed by an activation function. This allows the network to learn the
conditions for forgetting, ignoring, or keeping information in the memory cell.</p>
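        <p>To make the role of the gates concrete, the following sketch performs one time step of an LSTM cell in plain Python. It assumes the standard formulation, in which the new cell state is the forget-gated old state plus the input-gated candidate values and the hidden state is the output-gated hyperbolic tangent of the cell state. The function names, the parameter dictionary and the zero-valued toy parameters are illustrative assumptions, not the trained models of this study.</p>
        <preformat>
```python
import math


def sigmoid(z):
    """Logistic function used by the input, forget and output gates."""
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, v, b):
    """Return W.v + b for a matrix W (list of rows), vector v and bias b."""
    return [sum(w * x for w, x in zip(row, v)) + b_k for row, b_k in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM time step; p is a dict of weight matrices and bias vectors."""
    z = h_prev + x_t  # list concatenation of the previous hidden state and the input
    i_t = [sigmoid(a) for a in affine(p["W_i"], z, p["b_i"])]      # input gate
    f_t = [sigmoid(a) for a in affine(p["W_f"], z, p["b_f"])]      # forget gate
    c_new = [math.tanh(a) for a in affine(p["W_c"], z, p["b_c"])]  # candidate values
    o_t = [sigmoid(a) for a in affine(p["W_o"], z, p["b_o"])]      # output gate
    # Element-wise update of the cell state, then of the hidden state.
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_new)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t


# Toy cell: hidden size 1, input size 1, all parameters set to zero,
# so every gate equals 0.5 and the candidate value equals 0.
zeros = {name: [[0.0, 0.0]] for name in ("W_i", "W_f", "W_c", "W_o")}
zeros.update({name: [0.0] for name in ("b_i", "b_f", "b_c", "b_o")})
h_t, c_t = lstm_step(x_t=[1.0], h_prev=[0.0], c_prev=[1.0], p=zeros)
print(h_t, c_t)
```
        </preformat>
        <p>With all gates at 0.5, half of the previous cell state is kept and nothing new is written, which shows how the forget and input gates jointly control the memory.</p>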
        <p>Due to its structure, LSTM can learn and remember representations for variable-length sequences,
such as sentences, documents, and speech samples.</p>
        <p><bold>4.3.2. BiLSTM</bold></p>
        <p>A unidirectional (standard) LSTM only preserves information from the past because the only inputs
it has seen are from the past. Unlike the standard LSTM, in the BiLSTM (Bidirectional LSTM) model
the input flows in both directions, so the model can use information from both sides.</p>
        <p>BiLSTM is thus a sequence processing model that consists of two LSTM layers: one taking
the input in the forward direction (from “past to future”), and the other in the backward direction
(from “future to past”) (figure 4).</p>
        <p>For example, if we want to predict a word from its context (the central word), the network takes a
given number of words to the left of it as context, which is handled by the Forward layer, as well as
the words to the right of it, which are handled by the Backward layer.</p>
        <p>Then we can combine the outputs from both LSTM layers in different ways: as a sum, average,
concatenation or multiplication. This output contains information about the relation between past and
future words.</p>
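        <p>The four ways of combining the two directions can be sketched as below. The helper <monospace>merge_bidirectional</monospace> and the toy vectors are hypothetical; the mode names follow the <monospace>merge_mode</monospace> values of the Keras <monospace>Bidirectional</monospace> wrapper (sum, ave, concat, mul), and which mode the models in this study used is not stated here.</p>
        <preformat>
```python
def merge_bidirectional(forward, backward, mode="concat"):
    """Combine the forward and backward LSTM outputs for one time step."""
    if mode == "concat":
        return forward + backward  # output size doubles
    if len(forward) != len(backward):
        raise ValueError("element-wise modes need equally sized outputs")
    pairs = list(zip(forward, backward))
    if mode == "sum":
        return [f + b for f, b in pairs]
    if mode == "ave":
        return [(f + b) / 2.0 for f, b in pairs]
    if mode == "mul":
        return [f * b for f, b in pairs]
    raise ValueError("unknown merge mode: " + mode)


# Made-up outputs of the forward and backward layers for one token.
h_fwd = [1.0, 2.0]
h_bwd = [3.0, 4.0]
merged = {m: merge_bidirectional(h_fwd, h_bwd, m) for m in ("concat", "sum", "ave", "mul")}
print(merged)
```
        </preformat>
        <p>Concatenation keeps both directions separate at the cost of a larger output, while the element-wise modes keep the output size unchanged.</p>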
        <p>BiLSTM increases the amount of information available to the network, improving the context.</p>
        <p>It is also a more powerful tool for modeling the sequential dependencies between words and
phrases in both directions of the sequence than the standard LSTM.</p>
        <p>BiLSTM is usually used for sequence-to-sequence tasks, but it should be noted
that BiLSTM (compared to LSTM) is a much “slower” model and requires more time for training.</p>
      </sec>
      <sec id="sec-4-4">
        <title>4.4. CNN+LSTM model</title>
        <p>Both basic DNN architectures, CNN and LSTM, have their own advantages and disadvantages.
LSTM networks can capture long-term dependencies and find hidden relationships in the data.
CNNs are able to extract features using different convolutions and filters.</p>
        <p>
          Therefore, the combination of convolutional and recurrent layers in a model turns out to
be effective in many applied problems, such as simulation of various natural processes, image
processing, time series forecasting, and different NLP tasks [
          <xref ref-type="bibr" rid="ref28 ref31 ref45 ref46 ref47 ref48">45, 46, 47, 31, 28, 48</xref>
          ].
        </p>
        <p>We therefore developed two models based on modifications of the CNN+LSTM architecture, whose final
design and hyperparameter settings are given in Section 6.</p>
        <p>Our proposed models exploit the main features of both LSTM and CNN. LSTM can
accommodate long-term dependencies and overcome the key issue of vanishing gradients.
For this reason, LSTM is used when longer sequences are used as inputs. CNN, on the other hand,
is able to capture local patterns and position-invariant features of a text.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Datasets and software implementation</title>
      <p>All developed DNNs (CNN, CNN-LSTM, BiLSTM-CNN), and LR as the baseline, were
implemented in Python 3.8. The Scikit-learn library was used for LR and for estimating
classification accuracy, and the DNN models were built with the Keras library using TensorFlow
as the backend.</p>
      <p>
        We evaluate the performance of our models on two datasets: Stanford’s IMDb dataset
(Stanford’s Large Movie Review Dataset), which contains 50,000 movie reviews, and the
Sentiment140 dataset [
        <xref ref-type="bibr" rid="ref49">49</xref>
        ] with 1.6 million tweets.
      </p>
      <p>Both datasets are intended for binary classification: for each text (review or
tweet) they contain a binary sentiment class label. They are also balanced, i.e. they contain the same number of
texts for the positive and negative classes.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Empirical results</title>
      <sec id="sec-6-1">
        <title>6.1. Pre-processing and words embeddings</title>
        <p>
          For text pre-processing the Python library package NLTK [
          <xref ref-type="bibr" rid="ref50">50</xref>
          ] was used, as well as custom
regular expressions.
        </p>
        <p>The pre-processing stage included removing punctuation, markup tags, HTML and tweet
addresses, and stop words, and converting all words to lower case.</p>
        <p>Tokenization was performed using the Keras text preprocessing library. After tokenization,
the vocabulary contained 92393 unique tokens for the IMDb dataset and 507702 for
Sentiment140, to each of which one token was added to represent out-of-vocabulary
words.</p>
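        <p>The idea of a vocabulary with one extra out-of-vocabulary token can be sketched in plain Python as follows. The id layout (0 for padding, 1 for unknown words, more frequent words first), the whitespace tokenization and the toy texts are assumptions made for this illustration; they stand in for the Keras tokenizer rather than reproduce it.</p>
        <preformat>
```python
from collections import Counter

PAD_ID = 0  # reserved for padding
OOV_ID = 1  # the single extra token for out-of-vocabulary words


def build_vocab(texts):
    """Assign an integer id to every word seen in the training texts."""
    counts = Counter(word for text in texts for word in text.lower().split())
    # Most frequent words get the smallest ids; ties are broken alphabetically.
    ordered = sorted(counts, key=lambda w: (-counts[w], w))
    return {word: idx for idx, word in enumerate(ordered, start=2)}


def encode(text, vocab):
    """Replace each word by its id, or by OOV_ID if it is unknown."""
    return [vocab.get(word, OOV_ID) for word in text.lower().split()]


train_texts = ["great movie", "boring movie"]  # made-up examples
vocab = build_vocab(train_texts)
encoded = encode("great boring sequel", vocab)
print(vocab, encoded)  # "sequel" was never seen, so it maps to OOV_ID
```
        </preformat>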
        <p>It should be noted that the selected datasets are characterized by different average text
lengths (number of words). The length of most reviews does not exceed 500 words, and that of most
tweets does not exceed 50.</p>
        <p>Since DNNs work with fixed-length input sequences, we padded with zero tokens all reviews and
tweets shorter than the fixed lengths of 500 and 50 words (tokens), respectively,
and truncated longer texts to these fixed lengths.</p>
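        <p>Padding and truncation to a fixed length can be sketched as below. The sketch assumes that zeros are appended at the end and that long texts are cut at the end; the text does not state which side was used, and the Keras <monospace>pad_sequences</monospace> utility pads and truncates at the beginning by default. The helper name is hypothetical.</p>
        <preformat>
```python
def pad_or_truncate(token_ids, max_len, pad_id=0):
    """Return exactly max_len ids: cut long inputs, zero-pad short ones."""
    clipped = token_ids[:max_len]
    return clipped + [pad_id] * (max_len - len(clipped))


# Toy example with a fixed length of 5 instead of 500 or 50.
short_text = pad_or_truncate([5, 6, 7], 5)
long_text = pad_or_truncate([1, 2, 3, 4, 5, 6, 7], 5)
print(short_text, long_text)  # [5, 6, 7, 0, 0] [1, 2, 3, 4, 5]
```
        </preformat>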
        <p>
          For word vector representation, we used GloVe word embeddings with word vectors of
dimension 100 provided by the Gensim library [
          <xref ref-type="bibr" rid="ref51">51</xref>
          ].
        </p>
      </sec>
      <sec id="sec-6-2">
        <title>6.2. DNNs models design and hyperparameters setting</title>
        <p>To initialize the weights of the first layer (Embedding Layer) for all models, pre-trained GloVe
embeddings of size 100 were used. These weights were frozen and did not change during
training.</p>
        <p>The first model, CNN, consists of three sequential Convolutional layers with filter sets of
different kernel widths. These layers are interspersed with Maxpooling layers. They are followed by
a Flatten and a Fully connected (Dense) layer.</p>
        <p>The second model, CNN-LSTM, differs from the CNN by the presence of an LSTM layer
instead of a Flatten layer after the Convolutional and Maxpooling layers. The basic idea of such an architecture
is that the CNN can be used to retrieve higher-level word feature sequences and the LSTM to catch
long-term correlations across window feature sequences.</p>
        <p>The third model, BiLSTM-CNN, contains two BiLSTM layers (forward and backward), followed
by Convolutional and Maxpooling layers. After that, two Fully connected layers were used to
reduce the output dimension and make the prediction.</p>
        <p>
          Dropout layers were also used in all models to prevent overfitting. Binary Cross-Entropy (4)
was chosen as the loss function, which can be calculated as the average cross-entropy
over all data samples [
          <xref ref-type="bibr" rid="ref52">52</xref>
          ].
        </p>
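        <p>As a sketch of how the average binary cross-entropy over a set of samples is computed, consider the following plain-Python function. The clipping constant <monospace>eps</monospace> and the toy inputs are assumptions added for numerical safety and illustration.</p>
        <preformat>
```python
import math


def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Average binary cross-entropy over all samples.

    y_true: true labels (0 or 1); y_prob: predicted probabilities of class 1.
    """
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # keep log() away from 0
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


# An uninformative classifier that always outputs 0.5 has a loss of ln 2.
loss = binary_cross_entropy([1, 0], [0.5, 0.5])
print(loss)
```
        </preformat>
        <p>Confident wrong predictions are penalized much more strongly than uncertain ones, which is what drives the sigmoid output towards calibrated probabilities.</p>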
        <p>The final parameters of DNNs architecture are shown in table 1.</p>
      </sec>
      <sec id="sec-6-3">
        <title>6.3. Evaluating Performance Measures</title>
        <p>The datasets were divided into training, validation, and test subsets in the proportion 64%, 20%, and 16%,
respectively.</p>
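        <p>A split in these proportions can be sketched as follows. The random shuffle, the fixed seed and the absence of stratification are assumptions; the text does not say how the samples were assigned to the subsets.</p>
        <preformat>
```python
import random


def split_dataset(samples, seed=42):
    """Shuffle and split into 64% training, 20% validation, 16% test."""
    items = list(samples)
    random.Random(seed).shuffle(items)
    n = len(items)
    n_train = n * 64 // 100  # integer arithmetic avoids rounding surprises
    n_val = n * 20 // 100
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # the remaining 16%
    return train, val, test


# Sample ids stand in for the 50,000 IMDb reviews.
train, val, test = split_dataset(range(50000))
print(len(train), len(val), len(test))  # 32000 10000 8000
```
        </preformat>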
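        <p>All DNN models were trained over 5 epochs with a minibatch size of 256 and 1024 samples for
IMDb and Sentiment140, respectively. To compare the classification performance of the developed
models we used the Accuracy metric given by:
          <disp-formula><label>(13)</label><tex-math>\mathit{Accuracy} = \frac{T_P + T_N}{P + N} \times 100\%,</tex-math></disp-formula>
where <inline-formula><tex-math>T_P</tex-math></inline-formula> and <inline-formula><tex-math>T_N</tex-math></inline-formula> are the numbers of correctly predicted values of the positive and negative
classes, respectively; <inline-formula><tex-math>P</tex-math></inline-formula> and <inline-formula><tex-math>N</tex-math></inline-formula> are the actual numbers of values for each of the classes.</p>
        <p>We also calculated the F1-score, which is the harmonic average of Precision (the percentage
of objects classified as positive that actually belong to the positive class) and
Recall (the percentage of objects of the true positive class which we correctly classified):
          <disp-formula><label>(14)</label><tex-math>F1\text{-}\mathit{score} = \frac{2 \cdot \mathit{Precision} \cdot \mathit{Recall}}{\mathit{Precision} + \mathit{Recall}}.</tex-math></disp-formula>
        </p>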
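        <p>Both metrics can be computed from the confusion-matrix counts, as in the following sketch. The counts are made up for illustration and are not results of this study; the function names are hypothetical.</p>
        <preformat>
```python
def accuracy(tp, tn, p, n):
    """Share of correctly classified samples, in percent."""
    return 100.0 * (tp + tn) / (p + n)


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall for the positive class."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up confusion counts for a balanced test set of 100 texts.
tp, tn, fp, fn = 40, 45, 5, 10
p, n = tp + fn, tn + fp  # actual positives and actual negatives
acc = accuracy(tp, tn, p, n)
f1 = f1_score(tp, fp, fn)
print(acc, round(f1, 4))  # 85.0 0.8421
```
        </preformat>
        <p>On balanced datasets such as the two used here, accuracy and F1-score usually stay close to each other; they diverge mainly when one class is predicted much more often than the other.</p>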
        <p>Classification performance on the IMDb dataset for all developed DNN models is better than
the baseline. The best Accuracy was obtained using the CNN model (90.09%). At the same
time, models based on the combination of Convolutional and LSTM layers showed an Accuracy
2-3% lower (table 2).</p>
        <p>
          It should be noted that the obtained results are comparable or even superior in accuracy to the
results reported by other researchers [
          <xref ref-type="bibr" rid="ref33 ref34 ref53">33, 34, 53</xref>
          ] for IMDb dataset.
        </p>
        <p>All models showed significantly lower accuracy (on average 10% less) on the Sentiment140
dataset (table 3). The best result was achieved by the BiLSTM-CNN model, with an Accuracy of 82.1%.</p>
        <p>At the same time, the complication of models by adding new layers did not lead to a significant
increase in accuracy, but prolonged the training time.</p>
        <p>In our opinion, the lower accuracy may be due to the fact that the Sentiment140 dataset contains
many slang words that are out of vocabulary. While for the IMDb dataset the share of missing
words was about 30 percent, for Sentiment140 it was more than 70 percent.</p>
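        <p>The share of missing words can be estimated as in the sketch below. It assumes that the share is counted over unique vocabulary tokens rather than over word occurrences, which the text does not specify; the word lists and the helper name are made up.</p>
        <preformat>
```python
def oov_share(vocabulary, embedding_words):
    """Percentage of vocabulary words that have no pre-trained vector."""
    known = set(embedding_words)
    missing = [w for w in vocabulary if w not in known]
    return 100.0 * len(missing) / len(vocabulary)


# Made-up dataset vocabulary and embedding vocabulary.
dataset_vocab = ["good", "bad", "lol", "gr8"]
glove_words = ["good", "bad", "film"]
share = oov_share(dataset_vocab, glove_words)
print(share)  # 50.0
```
        </preformat>
        <p>With frozen embeddings, every missing word carries no pre-trained information into the network, so a high share directly limits what the models can learn from the text.</p>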
      </sec>
    </sec>
    <sec id="sec-7">
      <title>7. Discussion</title>
      <p>Our research sheds light on the effectiveness of relatively uncomplicated Deep Neural Network
(DNN) architectures with a modest layer count for sentiment analysis of social media texts,
particularly within binary classification scenarios. These models exhibit a level of accuracy that
is sufficiently practical for real-world applications.</p>
      <p>On the English-language datasets IMDb and Sentiment140, our models achieved the following
classification accuracy (IMDb first, Sentiment140 in parentheses): Logistic Regression (baseline) 85.9%
(74.23%), CNN 90.09% (77.24%), CNN-LSTM 88.01% (78.36%), and BiLSTM-CNN
87.03% (82.10%).</p>
      <p>Notably, preprocessing steps like lemmatization or stemming can likely boost classification
accuracy. This becomes especially relevant for tweets, which frequently feature an array of
user-generated vocabulary.</p>
      <p>Another avenue for potential improvement involves utilizing word embeddings weighted
by their Term Frequency-Inverse Document Frequency (TF-IDF) metric. Out-of-vocabulary
words could be handled by using the weighted average of
neighboring word embeddings within a designated window length, or by substituting missing words
with normalized TF-IDF embeddings transformed via principal component analysis (SVD
decomposition of the sparse TF-IDF matrix to reduce dimensionality).</p>
      <p>In our perspective, an exciting trajectory for advancing sentiment analysis in social media
involves the utilization of models rooted in deep convolutional networks or the amalgamation
of convolutional and recurrent networks. Coupling these models with pre-trained embeddings,
such as those founded on GloVe, Word2Vec, and FastText, holds promise. Leveraging pre-trained
embeddings allows the initialization of DNNs with parameters that are already somewhat
attuned to the text classification task, accelerating the learning process and enhancing the
generalization capabilities of classifiers founded on deep networks.</p>
    </sec>
    <sec id="sec-8">
      <title>8. Conclusion</title>
      <p>In conclusion, our research illuminates the efficacy of employing relatively straightforward
Deep Neural Network (DNN) architectures for sentiment analysis in the context of social media
text. Our findings underscore that even DNNs with limited complexity can yield accuracy levels
suitable for practical applications in binary sentiment classification.</p>
      <p>In experiments on the IMDb and Sentiment140 datasets, we observed the following
classification accuracy (IMDb first, Sentiment140 in parentheses): Logistic Regression (baseline) 85.9% (74.23%), CNN
90.09% (77.24%), CNN-LSTM 88.01% (78.36%), and BiLSTM-CNN
87.03% (82.10%).</p>
      <p>Enhancing the preprocessing steps with techniques like lemmatization and incorporating
weighted word embeddings via TF-IDF are potential strategies to further refine classification
accuracy. Additionally, the combination of deep convolutional and recurrent networks,
complemented by pre-trained embeddings, emerges as a promising avenue for advancing sentiment
analysis in social media. Pre-trained embeddings not only expedite learning but also enhance
the classifier’s ability to generalize.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>V.</given-names>
            <surname>Derbentsev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Bezkorovainyi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Matviychuk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Pomazun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hrabariev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hostryk</surname>
          </string-name>
          ,
          <article-title>Sentiment analysis of electronic social media based on deep learning</article-title>
          , in: S.
          <string-name>
            <surname>Semerikov</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          <string-name>
            <surname>Soloviev</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Matviychuk</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          <string-name>
            <surname>Kobets</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          <string-name>
            <surname>Kibalnyk</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          <string-name>
            <surname>Danylchuk</surname>
            ,
            <given-names>A</given-names>
          </string-name>
          . Kiv (Eds.),
          <source>Proceedings of 10th International Conference on Monitoring, Modeling &amp; Management of Emergent Economy - M3E2</source>
          , INSTICC, SciTePress,
          <year>2023</year>
          , pp.
          <fpage>163</fpage>
          -
          <lpage>175</lpage>
          . doi:10.5220/0011932300003432.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>M.</given-names>
            <surname>Azlinah</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. W.</given-names>
            <surname>Yap</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Zain</surname>
          </string-name>
          , M. W. Berry (Eds.),
          <source>Soft Computing in Data Science 6th International Conference, SCDS</source>
          <year>2021</year>
          , volume
          <volume>1489</volume>
          <source>of SCDS: International Conference on Soft Computing in Data Science</source>
          , Springer, Singapore,
          <year>2021</year>
          . doi:10.1007/978-981-16-7334-4.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>W.</given-names>
            <surname>Mayur</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. S. R.</given-names>
            <surname>Annavarapu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Chaitanya</surname>
          </string-name>
          ,
          <article-title>A survey on sentiment analysis methods, applications, and challenges</article-title>
          ,
          <source>Artificial Intelligence Review</source>
          <volume>55</volume>
          (
          <year>2022</year>
          )
          <fpage>5731</fpage>
          -
          <lpage>5780</lpage>
          . doi:10.1007/s10462-022-10144-1.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>M.</given-names>
            <surname>Silberztein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Atigui</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Kornyshova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Métais</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Meziane</surname>
          </string-name>
          (Eds.),
          <source>Natural Language Processing and Information Systems: Proceedings of 23rd International Conference on Applications of Natural Language to Information Systems</source>
          , volume
          <volume>10859</volume>
          of Lecture Notes in Computer Science, Springer, Cham,
          <year>2018</year>
          . doi:10.1007/978-3-319-91947-8.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>C. A.</given-names>
            <surname>Iglesias</surname>
          </string-name>
          ,
          <string-name>
            <surname>A</surname>
          </string-name>
          . Moreno (Eds.),
          <article-title>Sentiment Analysis for Social Media</article-title>
          ,
          <string-name>
            <surname>MDPI</surname>
          </string-name>
          ,
          <year>2020</year>
          . doi:10.3390/books978-3-03928-573-0.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>F.</given-names>
            <surname>Pozzi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Fersini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Messina</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <source>Sentiment Analysis in Social Networks, Elsevier Science</source>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>H.</given-names>
            <surname>Karamollaoğlu</surname>
          </string-name>
          , İ. A.
          <string-name>
            <surname>Doğru</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Dörterler</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Utku</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          <string-name>
            <surname>Yıldız</surname>
          </string-name>
          ,
          <article-title>Sentiment Analysis on Turkish Social Media Shares through Lexicon Based Approach</article-title>
          , in: 2018
          <source>3rd International Conference on Computer Science and Engineering (UBMK)</source>
          ,
          <year>2018</year>
          , pp.
          <fpage>45</fpage>
          -
          <lpage>49</lpage>
          . URL: https://ieeexplore.ieee.org/document/8566481.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>C.</given-names>
            <surname>Dhaoui</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. M.</given-names>
            <surname>Webster</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. P.</given-names>
            <surname>Tan</surname>
          </string-name>
          ,
          <article-title>Social media sentiment analysis: lexicon versus machine learning</article-title>
          ,
          <source>Journal of Consumer Marketing</source>
          <volume>34</volume>
          (
          <year>2017</year>
          )
          <fpage>480</fpage>
          -
          <lpage>488</lpage>
          . doi:10.1108/JCM-03-2017-2141.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>C. S.</given-names>
            <surname>Khoo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. B.</given-names>
            <surname>Johnkhan</surname>
          </string-name>
          ,
          <article-title>Lexicon-based sentiment analysis: Comparative evaluation of six sentiment lexicons</article-title>
          ,
          <source>Journal of Information Science</source>
          <volume>44</volume>
          (
          <year>2018</year>
          )
          <fpage>491</fpage>
          -
          <lpage>511</lpage>
          . doi:10.1177/0165551517703514.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>D.</given-names>
            <surname>Alessia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Ferri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Grifoni</surname>
          </string-name>
          , T. Guzzo,
          <article-title>Approaches, tools and applications for sentiment analysis implementation</article-title>
          ,
          <source>International Journal of Computer Applications</source>
          <volume>125</volume>
          (
          <year>2015</year>
          )
          <fpage>26</fpage>
          -
          <lpage>33</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>A.</given-names>
            <surname>Kiv</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Semerikov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V. N.</given-names>
            <surname>Soloviev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Kibalnyk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Danylchuk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Matviychuk</surname>
          </string-name>
          ,
          <article-title>Experimental Economics and Machine Learning for Prediction of Emergent Economy Dynamics</article-title>
          , in: A.
          <string-name>
            <surname>Kiv</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Semerikov</surname>
            ,
            <given-names>V. N.</given-names>
          </string-name>
          <string-name>
            <surname>Soloviev</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          <string-name>
            <surname>Kibalnyk</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          <string-name>
            <surname>Danylchuk</surname>
            ,
            <given-names>A</given-names>
          </string-name>
          . Matviychuk (Eds.),
          <source>Proceedings of the Selected Papers of the 8th International Conference on Monitoring, Modeling &amp; Management of Emergent Economy, M3E2-EEMLPEED</source>
          <year>2019</year>
          , Odessa, Ukraine, May 22-24,
          <year>2019</year>
          , volume
          <volume>2422</volume>
          <source>of CEUR Workshop Proceedings, CEUR-WS.org</source>
          ,
          <year>2019</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>4</lpage>
          . URL: https://ceur-ws.org/Vol-2422/paper00.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>V.</given-names>
            <surname>Derbentsev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Matviychuk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V. N.</given-names>
            <surname>Soloviev</surname>
          </string-name>
          ,
          <article-title>Forecasting of Cryptocurrency Prices Using Machine Learning</article-title>
          , in: L.
          <string-name>
            <surname>Pichl</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <string-name>
            <surname>Eom</surname>
          </string-name>
          , E. Scalas, T. Kaizoji (Eds.),
          <source>Advanced Studies of Financial Technologies and Cryptocurrency Markets</source>
          , Springer, Singapore,
          <year>2020</year>
          , pp.
          <fpage>211</fpage>
          -
          <lpage>231</lpage>
          . doi:10.1007/978-981-15-4498-9_12.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>P. V.</given-names>
            <surname>Zahorodko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. O.</given-names>
            <surname>Semerikov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V. N.</given-names>
            <surname>Soloviev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. M.</given-names>
            <surname>Striuk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. I.</given-names>
            <surname>Striuk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. M.</given-names>
            <surname>Shalatska</surname>
          </string-name>
          ,
          <article-title>Comparisons of performance between quantum-enhanced and classical machine learning algorithms on the IBM Quantum Experience</article-title>
          ,
          <source>Journal of Physics: Conference Series</source>
          <volume>1840</volume>
          (
          <year>2021</year>
          )
          012021. doi:10.1088/1742-6596/1840/1/012021.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>H.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <article-title>Deep learning for natural language processing: advantages and challenges</article-title>
          ,
          <source>National Science Review</source>
          <volume>5</volume>
          (
          <year>2017</year>
          )
          <fpage>24</fpage>
          -
          <lpage>26</lpage>
          . doi:10.1093/nsr/nwx110.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>K. W.</given-names>
            <surname>Trisna</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. J.</given-names>
            <surname>Jie</surname>
          </string-name>
          ,
          <article-title>Deep Learning Approach for Aspect-Based Sentiment Classification: A Comparative Review</article-title>
          ,
          <source>Applied Artificial Intelligence</source>
          <volume>36</volume>
          (
          <year>2022</year>
          )
          <elocation-id>2014186</elocation-id>
          . doi:10.1080/08839514.2021.2014186.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>U.</given-names>
            <surname>Kamath</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Whitaker</surname>
          </string-name>
          ,
          <source>Deep Learning for NLP and Speech Recognition</source>
          , Springer, Cham,
          <year>2019</year>
          . doi:10.1007/978-3-030-14596-5.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>T.</given-names>
            <surname>Mikolov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Corrado</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Dean</surname>
          </string-name>
          ,
          <article-title>Efficient Estimation of Word Representations in Vector Space</article-title>
          , in: Y. Bengio, Y. LeCun (Eds.),
          <source>1st International Conference on Learning Representations, ICLR 2013, Scottsdale, Arizona, USA, May 2-4, 2013, Workshop Track Proceedings</source>
          ,
          <year>2013</year>
          . URL: https://arxiv.org/abs/1301.3781.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>J.</given-names>
            <surname>Pennington</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Socher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Manning</surname>
          </string-name>
          ,
          <article-title>GloVe: Global Vectors for Word Representation</article-title>
          ,
          in:
          <source>Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)</source>
          , Association for Computational Linguistics, Doha, Qatar,
          <year>2014</year>
          , pp.
          <fpage>1532</fpage>
          -
          <lpage>1543</lpage>
          . doi:10.3115/v1/D14-1162.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>P.</given-names>
            <surname>Bojanowski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Grave</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Joulin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mikolov</surname>
          </string-name>
          ,
          <article-title>Enriching Word Vectors with Subword Information</article-title>
          ,
          <source>Transactions of the Association for Computational Linguistics</source>
          <volume>5</volume>
          (
          <year>2017</year>
          )
          <fpage>135</fpage>
          -
          <lpage>146</lpage>
          . doi:10.1162/tacl_a_00051.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>A. K.</given-names>
            <surname>Durairaj</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Chinnalagu</surname>
          </string-name>
          ,
          <article-title>Transformer based Contextual Model for Sentiment Analysis of Customer Reviews: A Fine-tuned BERT</article-title>
          ,
          <source>International Journal of Advanced Computer Science and Applications</source>
          <volume>12</volume>
          (
          <year>2021</year>
          ). doi:10.14569/IJACSA.2021.0121153.
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>M. P.</given-names>
            <surname>Geetha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Karthika Renuka</surname>
          </string-name>
          ,
          <article-title>Improving the performance of aspect based sentiment analysis using fine-tuned Bert Base Uncased model</article-title>
          ,
          <source>International Journal of Intelligent Networks</source>
          <volume>2</volume>
          (
          <year>2021</year>
          )
          <fpage>64</fpage>
          -
          <lpage>69</lpage>
          . doi:10.1016/j.ijin.2021.06.005.
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>H.</given-names>
            <surname>Deng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ergu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Cai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Ma</surname>
          </string-name>
          ,
          <article-title>Text sentiment analysis of fusion model based on attention mechanism</article-title>
          ,
          <source>Procedia Computer Science</source>
          <volume>199</volume>
          (
          <year>2022</year>
          )
          <fpage>741</fpage>
          -
          <lpage>748</lpage>
          . doi:10.1016/j.procs.2022.01.092.
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>S.</given-names>
            <surname>Tabinda Kokab</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Asghar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Naz</surname>
          </string-name>
          ,
          <article-title>Transformer-based deep learning models for the sentiment analysis of social media data</article-title>
          ,
          <source>Array</source>
          <volume>14</volume>
          (
          <year>2022</year>
          )
          <elocation-id>100157</elocation-id>
          . doi:10.1016/j.array.2022.100157.
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Drus</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Khalid</surname>
          </string-name>
          ,
          <article-title>Sentiment Analysis in Social Media and Its Application: Systematic Literature Review</article-title>
          ,
          <source>Procedia Computer Science</source>
          <volume>161</volume>
          (
          <year>2019</year>
          )
          <fpage>707</fpage>
          -
          <lpage>714</lpage>
          . doi:10.1016/j.procs.2019.11.174.
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>P. K.</given-names>
            <surname>Jain</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Pamula</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Srivastava</surname>
          </string-name>
          ,
          <article-title>A systematic literature review on machine learning applications for consumer sentiment analysis using online reviews</article-title>
          ,
          <source>Computer Science Review</source>
          <volume>41</volume>
          (
          <year>2021</year>
          )
          <elocation-id>100413</elocation-id>
          . doi:10.1016/j.cosrev.2021.100413.
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>P.</given-names>
            <surname>Sudhir</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V. D.</given-names>
            <surname>Suresh</surname>
          </string-name>
          ,
          <article-title>Comparative study of various approaches, applications and classifiers for sentiment analysis</article-title>
          ,
          <source>Global Transitions Proceedings</source>
          <volume>2</volume>
          (
          <year>2021</year>
          )
          <fpage>205</fpage>
          -
          <lpage>211</lpage>
          . doi:10.1016/j.gltp.2021.08.004.
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>M.</given-names>
            <surname>Biesialska</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Biesialska</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Rybinski</surname>
          </string-name>
          ,
          <article-title>Leveraging contextual embeddings and self-attention neural networks with bi-attention for sentiment analysis</article-title>
          ,
          <source>Journal of Intelligent Information Systems</source>
          <volume>57</volume>
          (
          <year>2021</year>
          )
          <fpage>601</fpage>
          -
          <lpage>626</lpage>
          . doi:10.1007/s10844-021-00664-7.
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>A.</given-names>
            <surname>Rasool</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Qu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Ji</surname>
          </string-name>
          ,
          <article-title>WRS: A Novel Word-embedding Method for Real-time Sentiment with Integrated LSTM-CNN Model</article-title>
          , in:
          <source>2021 IEEE International Conference on Real-time Computing and Robotics (RCAR)</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>590</fpage>
          -
          <lpage>595</lpage>
          . doi:10.1109/RCAR52367.2021.9517671.
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>H.</given-names>
            <surname>Elzayady</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. S.</given-names>
            <surname>Mohamed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Badran</surname>
          </string-name>
          ,
          <article-title>Integrated bidirectional LSTM-CNN model for customers reviews classification</article-title>
          ,
          <source>Journal of Engineering Science and Military Technologies</source>
          <volume>5</volume>
          (
          <year>2021</year>
          ). doi:10.21608/EJMTC.2021.66626.1172.
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>N.</given-names>
            <surname>Hernández</surname>
          </string-name>
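          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Batyrshin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Sidorov</surname>
          </string-name>
          ,
          <article-title>Evaluation of deep learning models for sentiment analysis</article-title>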
          ,
          <source>Journal of Intelligent &amp; Fuzzy Systems</source>
          (
          <year>2022</year>
          )
          <fpage>1</fpage>
          -
          <lpage>11</lpage>
          . doi:10.3233/JIFS-211909.
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>L.</given-names>
            <surname>Khan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Amjad</surname>
          </string-name>
          ,
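          <string-name>
            <given-names>K. M.</given-names>
            <surname>Afaq</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.-T.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,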
          <article-title>Deep Sentiment Analysis Using CNN-LSTM Architecture of English and Roman Urdu Text Shared in Social Media</article-title>
          ,
          <source>Applied Sciences</source>
          <volume>12</volume>
          (
          <year>2022</year>
          )
          <elocation-id>2694</elocation-id>
          . doi:10.3390/app12052694.
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>I.</given-names>
            <surname>Priyadarshini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Cotton</surname>
          </string-name>
          ,
          <article-title>A novel LSTM-CNN-grid search-based deep neural network for sentiment analysis</article-title>
          ,
          <source>The Journal of Supercomputing</source>
          <volume>77</volume>
          (
          <year>2021</year>
          )
          <fpage>13911</fpage>
          -
          <lpage>13932</lpage>
          . doi:10.1007/s11227-021-03838-w.
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [33]
          <string-name>
            <given-names>M. R.</given-names>
            <surname>Haque</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Salma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. A.</given-names>
            <surname>Lima</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Zaman</surname>
          </string-name>
          ,
          <source>Performance Analysis of Different Neural Networks for Sentiment Analysis on IMDb Movie Reviews</source>
          ,
          <year>2020</year>
          . URL: https://www.researchgate.net/publication/343046458.
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          [34]
          <string-name>
            <given-names>A. H.</given-names>
            <surname>Quraishi</surname>
          </string-name>
          ,
          <article-title>Performance Analysis of Machine Learning Algorithms for Movie Review</article-title>
          ,
          <source>International Journal of Computer Applications</source>
          <volume>177</volume>
          (
          <year>2020</year>
          )
          <fpage>7</fpage>
          -
          <lpage>10</lpage>
          . doi:10.5120/ijca2020919839.
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          [35]
          <string-name>
            <given-names>V.</given-names>
            <surname>Derbentsev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Bezkorovainyi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Akhmedov</surname>
          </string-name>
          ,
          <article-title>Machine Learning Approach of Analysis Emotion Polarity Electronic Social Media</article-title>
          ,
          <source>Neiro-Nechitki Tekhnolohii Modelyuvannya v Ekonomitsi</source>
          <volume>9</volume>
          (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          [36]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Bengio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Ducharme</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Vincent</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Jauvin</surname>
          </string-name>
          ,
          <article-title>A neural probabilistic language model</article-title>
          ,
          <source>Journal of Machine Learning Research</source>
          <volume>3</volume>
          (
          <year>2003</year>
          )
          <fpage>1137</fpage>
          -
          <lpage>1155</lpage>
          . URL: https://proceedings.neurips.cc/paper/2000/file/728f206c2a01bf572b5940d7d9a8fa4c-Paper.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref37">
        <mixed-citation>
          [37]
          <string-name>
            <given-names>J.</given-names>
            <surname>Brownlee</surname>
          </string-name>
          ,
          <source>Deep Learning for Natural Language Processing: Develop Deep Learning Models for Natural Language in Python</source>
          ,
          <year>2017</year>
          . URL: http://ling.snu.ac.kr/class/AI_Agent/deep_learning_for_nlp.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref38">
        <mixed-citation>
          [38]
          <string-name>
            <given-names>H.</given-names>
            <surname>Lane</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Howard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Hapke</surname>
          </string-name>
          ,
          <source>Natural Language Processing in Action: Understanding, analyzing, and generating text with Python</source>
          , Manning Publications,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref39">
        <mixed-citation>
          [39]
          <string-name>
            <given-names>J.</given-names>
            <surname>Camacho-Collados</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. T.</given-names>
            <surname>Pilehvar</surname>
          </string-name>
          ,
          <article-title>On the Role of Text Preprocessing in Neural Network Architectures: An Evaluation Study on Text Categorization and Sentiment Analysis</article-title>
          ,
          in:
          <source>Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP</source>
          , Association for Computational Linguistics, Brussels, Belgium,
          <year>2018</year>
          , pp.
          <fpage>40</fpage>
          -
          <lpage>46</lpage>
          . doi:10.18653/v1/W18-5406.
        </mixed-citation>
      </ref>
      <ref id="ref40">
        <mixed-citation>
          [40]
          <string-name>
            <given-names>J.</given-names>
            <surname>Devlin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.-W.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Toutanova</surname>
          </string-name>
          ,
          <article-title>BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding</article-title>
          ,
          <year>2018</year>
          . URL: https://arxiv.org/abs/1810.04805.
        </mixed-citation>
      </ref>
      <ref id="ref41">
        <mixed-citation>
          [41]
          <string-name>
            <given-names>Y.</given-names>
            <surname>LeCun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Bengio</surname>
          </string-name>
          ,
          <article-title>Convolutional Networks for Images, Speech, and Time Series</article-title>
          , in:
          <source>The Handbook of Brain Theory and Neural Networks</source>
          , MIT Press, Cambridge, MA, USA,
          <year>1998</year>
          , pp.
          <fpage>255</fpage>
          -
          <lpage>258</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref42">
        <mixed-citation>
          [42]
          <string-name>
            <given-names>Y.</given-names>
            <surname>LeCun</surname>
          </string-name>
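          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Bengio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Hinton</surname>
          </string-name>
          ,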
          <article-title>Deep learning</article-title>
          ,
          <source>Nature</source>
          <volume>521</volume>
          (
          <year>2015</year>
          )
          <fpage>436</fpage>
          -
          <lpage>444</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref43">
        <mixed-citation>
          [43]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <article-title>Convolutional Neural Networks for Sentence Classification</article-title>
          , in: A. Moschitti, B. Pang, W. Daelemans (Eds.),
          <source>Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, EMNLP 2014, October 25-29, 2014, Doha, Qatar, A meeting of SIGDAT, a Special Interest Group of the ACL</source>
          , ACL,
          <year>2014</year>
          , pp.
          <fpage>1746</fpage>
          -
          <lpage>1751</lpage>
          . doi:10.3115/v1/d14-1181.
        </mixed-citation>
      </ref>
      <ref id="ref44">
        <mixed-citation>
          [44]
          <string-name>
            <given-names>S.</given-names>
            <surname>Hochreiter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Schmidhuber</surname>
          </string-name>
          ,
          <article-title>Long Short-Term Memory</article-title>
          ,
          <source>Neural Computation</source>
          <volume>9</volume>
          (
          <year>1997</year>
          )
          <fpage>1735</fpage>
          -
          <lpage>1780</lpage>
          . doi:10.1162/neco.1997.9.8.1735.
        </mixed-citation>
      </ref>
      <ref id="ref45">
        <mixed-citation>
          [45]
          <string-name>
            <given-names>N.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <article-title>Advanced Combined LSTM-CNN Model for Twitter Sentiment Analysis</article-title>
          ,
          in:
          <source>2018 5th IEEE International Conference on Cloud Computing and Intelligence Systems (CCIS)</source>
          ,
          <year>2018</year>
          , pp.
          <fpage>684</fpage>
          -
          <lpage>687</lpage>
          . doi:10.1109/CCIS.2018.8691381.
        </mixed-citation>
      </ref>
      <ref id="ref46">
        <mixed-citation>
          [46]
          <string-name>
            <given-names>V.</given-names>
            <surname>Derbentsev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Bezkorovainyi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Silchenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hrabariev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Pomazun</surname>
          </string-name>
          ,
          <article-title>Deep Learning Approach for Short-Term Forecasting Trend Movement of Stock Indeces</article-title>
          ,
          in:
          <source>2021 IEEE 8th International Conference on Problems of Infocommunications, Science and Technology (PIC S&amp;T)</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>607</fpage>
          -
          <lpage>612</lpage>
          . doi:10.1109/PICST54195.2021.9772235.
        </mixed-citation>
      </ref>
      <ref id="ref47">
        <mixed-citation>
          [47]
          <string-name>
            <given-names>M. Z.</given-names>
            <surname>Islam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. M.</given-names>
            <surname>Islam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Asraf</surname>
          </string-name>
          ,
          <article-title>A combined deep CNN-LSTM network for the detection of novel coronavirus (COVID-19) using X-ray images</article-title>
          ,
          <source>Informatics in Medicine Unlocked</source>
          <volume>20</volume>
          (
          <year>2020</year>
          )
          <elocation-id>100412</elocation-id>
          . doi:10.1016/j.imu.2020.100412.
        </mixed-citation>
      </ref>
      <ref id="ref48">
        <mixed-citation>
          [48]
          <string-name>
            <given-names>L.</given-names>
            <surname>Shang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Sui</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <article-title>Sentiment analysis of film reviews based on CNN-BLSTM-attention</article-title>
          ,
          <source>Journal of Physics: Conference Series</source>
          <volume>1550</volume>
          (
          <year>2020</year>
          )
          <elocation-id>032056</elocation-id>
          . doi:10.1088/1742-6596/1550/3/032056.
        </mixed-citation>
      </ref>
      <ref id="ref49">
        <mixed-citation>
          [49]
          <string-name>
            <surname>Kaggle</surname>
          </string-name>
          ,
          <article-title>Sentiment140 dataset with 1.6 million tweets</article-title>
          ,
          <year>2022</year>
          . URL: https://www.kaggle.com/datasets/kazanova/sentiment140.
        </mixed-citation>
      </ref>
      <ref id="ref50">
        <mixed-citation>
          [50]
          <string-name>
            <surname>NLTK Project</surname>
          </string-name>
          ,
          <source>Natural language toolkit</source>
          ,
          <year>2022</year>
          . URL: https://www.nltk.org/.
        </mixed-citation>
      </ref>
      <ref id="ref51">
        <mixed-citation>
          [51]
          <string-name>
            <given-names>R.</given-names>
            <surname>Řehůřek</surname>
          </string-name>
          ,
          <source>Gensim: Topic modelling for humans</source>
          ,
          <year>2022</year>
          . URL: https://radimrehurek.com/gensim/.
        </mixed-citation>
      </ref>
      <ref id="ref52">
        <mixed-citation>
          [52]
          <string-name>
            <given-names>A.</given-names>
            <surname>Geron</surname>
          </string-name>
          ,
          <source>Hands-On Machine Learning with Scikit-Learn and TensorFlow</source>
          , O'Reilly Media, Inc.,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref53">
        <mixed-citation>
          [53]
          <string-name>
            <given-names>N. M.</given-names>
            <surname>Ali</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. M. A.</given-names>
            <surname>El Hamid</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Youssif</surname>
          </string-name>
          ,
          <article-title>Sentiment Analysis for Movies Reviews Dataset Using Deep Learning Models</article-title>
          ,
          <source>International Journal of Data Mining &amp; Knowledge Management Process (IJDKP)</source>
          <volume>9</volume>
          (
          <year>2019</year>
          ). URL: https://aircconline.com/abstract/ijdkp/v9n3/9319ijdkp02.html.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>