<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>K. Bhuvaneshwari, D. S. A. J. Rani, D. V. V. Haragopal, Sentiment Analysis of Tweets on
Telangana State Government Flagship Schemes, Int. J. Eng. Adv. Technol.</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.1049/cit2.12037</article-id>
      <title-group>
        <article-title>Vita Kashtan†, Volodymyr Hnatushenko∗,†, Maksym Ovcharenko†, Artem Ivanko†</article-title>
      </title-group>
      <contrib-group>
        <aff id="aff0">
          <label>0</label>
          <institution>Dnipro University of Technology</institution>
          ,
          <addr-line>Dmytra Yavornytskoho Ave 19, Dnipro, 49005</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2015</year>
      </pub-date>
      <volume>1</volume>
      <issue>2022</issue>
      <fpage>23</fpage>
      <lpage>27</lpage>
      <abstract>
        <p>The paper proposes a hybrid neural network architecture for multi-class classification of the emotional state of social media texts, which combines contextual vector representations generated by the BERT model with extended structured features related to the user and message metadata. The proposed approach uses a multilayer perceptron network that combines linguistic and contextual information in a single representation. The results of the experimental study confirm the superiority of the proposed approach over traditional LSTM and CNN architectures, as well as over the separate use of BERT embeddings. The achieved classification accuracy is 90%, and the F1-measure is 0.91, which indicates the high efficiency of the model in conditions of high variability of language structures and stylistic features of social content.</p>
      </abstract>
      <kwd-group>
        <kwd>emotion state</kwd>
        <kwd>contextual analysis</kwd>
        <kwd>semantic analysis</kwd>
        <kwd>deep learning</kwd>
        <kwd>neural network architecture</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        The current stage of the digital society's development is characterized by rapid growth in the
volume of data generated by user activity on social media platforms such as Twitter, Facebook, and
other microblogging services [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. These platforms have become key channels for the spontaneous
expression of opinions, emotions, and real-time reactions to events. Despite their brevity and
informal style, messages posted on social networks often reflect deep aspects of users'
psychoemotional states, making them a valuable source for public opinion analysis, consumer behavior
prediction, and studying social processes [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>However, the short, non-standard, and often context-rich nature of such messages poses
significant challenges for traditional natural language processing methods. Conventional linguistic
models, which focus primarily on syntactic and semantic analysis, often fail to interpret the
emotional nuances of expressions adequately, especially when they do not consider the interaction
context, user social activity, and platform-specific characteristics. In the business environment, similar
difficulties arise when analyzing large volumes of voice and textual data received by contact
centers, where service quality increasingly depends on a deep understanding of the emotional
dynamics of communication.</p>
      <p>
        Sentiment analysis, or emotional text classification, is one of the key tasks in Natural Language
Processing, aimed at automatically identifying the emotional tone of an utterance that reflects the
author's attitude toward an object, event, or phenomenon [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. In this context, the central concept is
the opinion holder, the subject expressing a viewpoint. It may be an individual, a group, an
organization, or a collective entity. A distinction is made between direct emotional expression,
which explicitly relates to the object, and indirect emotion, which stems from the evaluation of
events or consequences associated with it [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
      </p>
      <p>Most existing sentiment analysis models applied to social media content focus exclusively on
the linguistic content of messages, overlooking user metadata, the topic of discussion, or the
specific nature of the platform. This limits their ability to accurately detect emotions, especially
under conditions of high linguistic variability, the use of slang, emojis, and multimodal elements.</p>
      <p>Thus, the study aims to overcome the limitations of existing emotion analysis methods and
develop a comprehensive approach that ensures higher accuracy and adaptability under real-world
conditions in social media and business communication. The architecture proposed in this work
opens up new prospects for the automation of emotional content evaluation and for enhancing the
effectiveness of interaction in the digital environment.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related works</title>
      <p>
        Over the past decades, numerous methods for automated emotion analysis have been proposed,
among which three main approaches dominate: rule- and lexicon-based methods, machine learning
techniques, and deep learning models. The most common classifiers are the Naive Bayes classifier
and the Support Vector Machine (SVM) [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. These models demonstrate effectiveness on
well-structured textual datasets where the emotional tone of expressions is clear and easily recognizable
(e.g., distinctly positive or negative); however, they perform poorly when adapting to new thematic
domains or stylistically different content. As noted by Paltoglou and Giachanou [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], the primary
limitation of these approaches is their high sensitivity to the subject domain, which leads to a loss
of generalization capability when the context changes. This issue arises from the difficulty of
creating sufficiently comprehensive and representative training datasets for specific tasks.
      </p>
      <p>
        Lexicon-based methods rely on affective lexicons containing words assigned positive or
negative values. Well-known examples of such lexicons include General Inquirer [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ],
SentiWordNet [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], and WordNet-Affect [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. Some studies [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] demonstrate the effectiveness of
lexicon-based analysis in social media contexts, especially when enhanced with syntactic rules
[
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. However, the success of these methods depends heavily on the completeness of the lexicon and
the domain specificity of its vocabulary, and they cannot adequately reflect contextual changes in
meaning [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. Furthermore, lexicons do not cover the dynamic vocabulary of social media,
including slang, emojis, and abbreviations.
      </p>
      <p>
        Over the past decade, deep learning (DL), implemented through multilayer neural networks, has
become a key direction in the evolution of emotion analysis methods. Due to its ability to
automatically extract high-level abstract features from unstructured data, DL has demonstrated
high effectiveness in complex tasks, particularly in computer vision, satellite image processing,
object recognition, speech processing, and text analysis [
        <xref ref-type="bibr" rid="ref13 ref14 ref15 ref16">13, 14, 15, 16</xref>
        ]. One example of the
successful application of deep learning in sentiment analysis on Twitter is the Sentiment-Specific
Word Embedding approach proposed in [17]. In this approach, distant supervision based on
pre-labeled tweets was used to train vector representations of words. Further research indicates that
traditional embedding methods do not accurately capture the sentiment information of rarely used
words. To address this, the Bayesian Estimation-based Sentiment Word Embedding model was
proposed, which extracts emotional information from such words more accurately
using Bayesian estimation and a special loss function. It has significantly improved the quality of
the embeddings and the overall accuracy of sentiment analysis [18]. The fusion of unstructured data
from various sources is a significant area of contemporary research in data processing and artificial
intelligence [19]. By combining information from textual, visual, audio, and other types of
unstructured data, deep learning-based systems can generate more comprehensive and context-rich
representations, thereby enhancing recognition and classification accuracy in complex tasks [20].
      </p>
      <p>Further development in this field is associated with the use of convolutional neural networks
(CNN) for phrase and document analysis [21], as well as recurrent architectures, notably Long
Short-Term Memory (LSTM) networks [22]. In [22], a hierarchical LSTM model was proposed that
considers context at both the individual tweet and message stream levels. Additionally, several types of
context (social, topical, conversational) were adapted and represented as binary features. Results
obtained on a corpus of 15,000 tweets annotated with three classes (positive, negative, neutral)
demonstrated a significant improvement in accuracy compared to baseline models (SVM,
LSTM-RNN, CNN). Other studies propose hybrid deep learning models with textual representations
generated by BERT to analyze reviews from Indonesian e-commerce platforms. A comparison of the
results showed that models with BERT representations outperform models with classical
embeddings [23, 24].</p>
      <p>To overcome the limitations of the approaches above, hybrid methods [25] have been proposed
that combine lexical analysis with machine learning. The linguistic analysis stage typically
generates input features for subsequent classification, and the results are used for iterative
dictionary expansion and model improvement [26]. Despite this, hybrid approaches remain limited
in their ability to model the complex semantic and pragmatic structure of utterances.</p>
      <p>Despite the active development of approaches for the automated analysis of emotional states in
texts, particularly in social media, modern methods still exhibit several significant limitations.
Traditional approaches display semantic rigidity, which hinders the accurate recognition of
emotions in messages containing a high degree of informal, contextually variable vocabulary
characteristic of digital communication. Moreover, such models typically have limited adaptability:
algorithms trained in one domain tend to perform poorly when transferred to other subject areas
or topics. One of the key challenges remains the neglect of social context and metadata: most
approaches focus exclusively on textual content, overlooking critical characteristics such as writing
style, temporal dynamics of messages, user activity, follower count, or the author’s influence
within the network. Integrating multidimensional features within a unified model also poses a
challenge; the growing volume of available structured information necessitates architectures
capable of effectively combining linguistic, behavioral, and contextual attributes.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Problem statement</title>
      <p>Although transformer-based models, particularly BERT, demonstrate high effectiveness in text
classification due to their ability to capture sentence-level contextual information, they are not
inherently designed to process additional structured features such as social context or user profile
characteristics without specialized architectural modifications. Therefore, developing a hybrid
architecture integrating high-level context-aware language representations generated by BERT
with multidimensional information about the user or the publication environment remains
relevant. Such an approach can significantly improve the accuracy of emotional state classification
of messages in dynamic information environments.</p>
      <p>This study aims to design and experimentally evaluate a hybrid neural network architecture
that combines contextual vector representations of text obtained via the BERT model with
extended user-related features. This is achieved by integrating both into a multi-level architecture
based on a multilayer perceptron (MLP) to enhance the accuracy of semantic and contextual
sentiment analysis in social media content.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Proposed approach</title>
      <p>The proposed architecture of hybrid semantic-contextual analysis of emotions in social content,
shown in Figure 1, consists of four key steps: data collection and preprocessing, extraction of
extended features, construction of a hybrid model, and evaluation of its effectiveness.</p>
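      <p>In the first step, the data are collected: for each message, the text <inline-formula><tex-math>t_i</tex-math></inline-formula>, message metadata
<inline-formula><tex-math>m_i \in \mathbb{R}^{d_m}</tex-math></inline-formula> (e.g., timestamp, number of retweets and likes), and user-related
information <inline-formula><tex-math>u_i \in \mathbb{R}^{d_u}</tex-math></inline-formula> (e.g., number of followers, posting history, etc.) are extracted. All collected
components form an extended input dataset:
        <disp-formula id="eq-1"><label>(1)</label><tex-math>D = \{(t_i, m_i, u_i) \mid i = 1, \dots, N\}.</tex-math></disp-formula>
      </p>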
      <p>The second step involves text preprocessing to convert raw content into a format suitable for
analysis. The process includes tokenization (breaking the text into linguistic units), normalization
(lowercasing, error correction), and the removal of noise elements such as stop words, URLs,
and user mentions. Emojis and hashtags, however, are not removed entirely but are
transformed into special representations, as they carry significant emotional and contextual
meaning in social media. This results in a cleaned and normalized text corpus.</p>
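      <p>Formally, tokenization maps each message to a sequence of tokens:
        <disp-formula id="eq-2"><label>(2)</label><tex-math>t_i \rightarrow \{w_{i1}, w_{i2}, \dots, w_{i n_i}\},</tex-math></disp-formula>
where <inline-formula><tex-math>t_i</tex-math></inline-formula> is the text, <inline-formula><tex-math>w_{ij}</tex-math></inline-formula> is the j-th token in message <inline-formula><tex-math>t_i</tex-math></inline-formula>, and <inline-formula><tex-math>n_i</tex-math></inline-formula> is the length of the tokenized text <inline-formula><tex-math>t_i</tex-math></inline-formula>.</p>
      <p>The third step focuses on extracting advanced features that reflect various aspects of the
messages and their authors. Lexical features include word frequency, n-gram sequences, number of
words and punctuation marks, and the presence of slang or foul language:
        <disp-formula id="eq-3"><label>(3)</label><tex-math>f_{\mathrm{lex}} : t_i \rightarrow l_i, \quad l_i \in \mathbb{R}^{d_1},</tex-math></disp-formula>
where <inline-formula><tex-math>d_1</tex-math></inline-formula> is the dimensionality of lexical features.</p>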
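      <p>The preprocessing described in the second step can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the stop-word list, the <monospace>hashtag_</monospace> and <monospace>emoji_</monospace> placeholder formats, the emoji code-point ranges, and the decision to drop the remaining punctuation are hypothetical choices made for the example, and error correction is omitted.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import re

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#(\w+)")
TOKEN_RE = re.compile(r"\w+|[^\w\s]")

# Illustrative stop-word list (assumption); a real pipeline would use a fuller one.
STOP_WORDS = {"a", "an", "and", "are", "in", "is", "of", "the", "to"}


def is_emoji(ch):
    """Rough emoji test over two common code-point ranges (assumption)."""
    code = ord(ch)
    return 0x1F300 <= code <= 0x1FAFF or 0x2600 <= code <= 0x27BF


def preprocess(text):
    """Return the cleaned token list for one message."""
    text = text.lower()
    text = URL_RE.sub(" ", text)                  # URLs are treated as noise
    text = MENTION_RE.sub(" ", text)              # user mentions are treated as noise
    text = HASHTAG_RE.sub(r" hashtag_\1 ", text)  # hashtags become special tokens
    tokens = []
    for tok in TOKEN_RE.findall(text):
        if tok in STOP_WORDS:
            continue
        if len(tok) == 1 and is_emoji(tok):
            # Emojis are kept as placeholder tokens built from the code point.
            tokens.append("emoji_" + format(ord(tok), "x"))
        elif tok[0].isalnum() or tok[0] == "_":
            tokens.append(tok)                    # ordinary word or hashtag token
        # Any other punctuation is dropped in this sketch.
    return tokens


example = "The vaccine is here!! \U0001F600 #COVID19 @who https://t.co/abc"
print(preprocess(example))
```
]]></preformat>
      <p>URLs are removed before mentions and hashtags so that characters inside a link are not mistaken for either. The resulting token list plays the role of the token sequence produced by tokenization.</p>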
      <p>Sentiment features are based on external affective dictionaries and pre-trained models that
assess the polarity and subjectivity of the text, as well as take into account emotion enhancers or
weakeners. User metadata (number of followers, publication history, verification status) and
message characteristics (time of publication, activity in the form of retweets and likes, geolocation)
are also used to generate contextual features. After extraction, all features are calculated and
combined into a feature vector for each message, forming a complete dataset further divided into
training and test sets.</p>
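      <p>
        <disp-formula id="eq-4"><label>(4)</label><tex-math>c_i = [m_i, u_i] \in \mathbb{R}^{d_m + d_u},</tex-math></disp-formula>
where <inline-formula><tex-math>c_i</tex-math></inline-formula> is the contextual feature vector of message <inline-formula><tex-math>i</tex-math></inline-formula>, obtained by concatenating the message metadata <inline-formula><tex-math>m_i \in \mathbb{R}^{d_m}</tex-math></inline-formula> and the user-related information <inline-formula><tex-math>u_i \in \mathbb{R}^{d_u}</tex-math></inline-formula>.</p>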
      <p>The main component of the proposed architecture is a hybrid model that combines powerful
contextual representations generated by the transformer-based BERT model [23, 24] with traditional
and user-related features. The cleaned and normalized text is fed to the pre-trained BERT model,
which produces contextual vector embeddings for each message that reflect both the semantic
and syntactic context of the text. The resulting BERT vectors are concatenated with the extended
feature vector generated in the previous step. The combined vector serves as input to a multilayer
perceptron network that learns to classify the emotional state of a message into predefined
categories (positive, negative, neutral, and more detailed emotions such as joy, sadness, anger, etc.).
The MLP can model complex nonlinear relationships between input features and target classes.</p>
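      <p>The fusion step, concatenating a text embedding with the extended feature vector and passing the result through an MLP, can be illustrated with a toy forward pass in pure Python. This is only a sketch under assumptions: the layer sizes, the single ReLU hidden layer, and the random untrained weights are hypothetical, and a random vector stands in for the BERT embedding (BERT-base would produce 768 dimensions), so the output probabilities carry no meaning beyond showing the data flow.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import math
import random


def dense(x, weights, bias):
    """Affine layer; `weights` holds one row of coefficients per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def softmax(z):
    m = max(z)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(rng, n_in, n_out):
    """Random, untrained weights (illustration only) and zero biases."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def hybrid_forward(text_embedding, extra_features, params):
    """Concatenate both inputs, then apply hidden layer and softmax output."""
    x = list(text_embedding) + list(extra_features)
    (w1, b1), (w2, b2) = params
    hidden = relu(dense(x, w1, b1))
    return softmax(dense(hidden, w2, b2))


rng = random.Random(0)
EMB_DIM, FEAT_DIM, HIDDEN, N_CLASSES = 8, 4, 6, 5  # toy sizes (assumption)
params = (
    init_layer(rng, EMB_DIM + FEAT_DIM, HIDDEN),
    init_layer(rng, HIDDEN, N_CLASSES),
)
text_embedding = [rng.gauss(0.0, 1.0) for _ in range(EMB_DIM)]  # stand-in for BERT
extra_features = [0.3, 0.1, 0.0, 1.0]  # stand-in for scaled metadata features
probs = hybrid_forward(text_embedding, extra_features, params)
print(probs)
```
]]></preformat>
      <p>In practice the metadata features would typically be scaled to a range comparable to that of the embedding before concatenation, since raw counts such as follower numbers can otherwise dominate the input.</p>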
      <p>The final stage involves evaluating the model's performance using classical classification
metrics: F1-score, Precision, and Recall on a test dataset. This hybrid approach, which integrates
deep contextual representations of text with multidimensional features, is expected to provide a
more accurate and comprehensive semantic and contextual analysis of emotions in social content
compared to methods based solely on text or individual feature sets.</p>
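      <p>The metrics named above can be computed from true and predicted labels as sketched below. The sketch assumes the common convention in which the macro average is the unweighted mean of the per-class values; the toy labels are invented for the example and are not taken from the experiments.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def macro_scores(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F1-score."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    precisions, recalls, f1s = [], [], []
    for c in labels:
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return {
        "accuracy": sum(1 for t, p in pairs if t == p) / len(pairs),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy labels for illustration only.
y_true = ["neg", "neg", "neu", "pos", "pos", "pos"]
y_pred = ["neg", "neu", "neu", "pos", "pos", "neg"]
scores = macro_scores(y_true, y_pred)
print(scores)
```
]]></preformat>
      <p>Because every class contributes equally to a macro average, a weak minority class lowers the score as much as a weak majority class would, which is why this format suits unevenly distributed sentiment classes.</p>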
    </sec>
    <sec id="sec-5">
      <title>5. Experiment</title>
      <p>The proposed hybrid neural network architecture was experimentally validated on a textual
dataset collected from the Twitter social media platform. The choice of Twitter is motivated by its popularity
as a medium for expressing spontaneous opinions and emotions, as well as the fact that tweets,
though limited to 280 characters, often contain sufficiently meaningful information for public
opinion analysis.</p>
      <p>The primary test dataset consists of approximately 80,000 textual messages related to the global
COVID-19 pandemic. These data were collected over three months, from March 2020 to May 2020
[27], covering the phase of active virus spread and public discourse around related events.
Hashtags, an integral part of Twitter content, were used not only as markers for sentiment
classification but also as an effective means for filtering and selecting relevant data, ensuring the
thematic consistency of the dataset.</p>
      <p>Analysis of this dataset reveals key statistical patterns in the characteristics of textual messages
that may significantly influence their emotional tone. In particular, based on the histograms of key
linguistic and structural features (Figure 2), the following characteristics are observed:</p>
      <p>1. The number of words and text length indicate that most tweets are short, condensing
information within a limited character space.</p>
      <p>2. The number of unique words reflects the lexical diversity used in the messages.</p>
      <p>3. The number of stop words captures the frequency of commonly used words, which are
traditionally removed during text preprocessing but may carry indicative value in specific models.</p>
      <p>4. Average word length provides insight into the level of formality or informality of the
language.</p>
      <p>5. The number of punctuation marks, especially exclamation marks, may indicate emotional
intensity.</p>
      <p>6. The number of URL links points to the use of external information sources, which may form
part of the message context.</p>
      <p>7. The number of hashtags reflects the authors' intent to categorize messages, highlight topics,
or associate with specific communities.</p>
      <p>8. The number of users mentioned indicates the level of social interaction, an essential context
component.</p>
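      <p>The surface features listed above can be computed per tweet with a short function such as the one below. It is a sketch under assumptions: whitespace tokenization, the small stop-word list, and counting every ASCII punctuation character (including those inside hashtags, mentions, and URLs) are illustrative choices, and the dictionary keys simply follow the feature names used later in the correlation analysis.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import re
import string

# Illustrative stop-word list (assumption).
STOP_WORDS = {"a", "an", "and", "are", "at", "in", "is", "of", "the", "to"}


def surface_features(text):
    """Compute simple linguistic and structural features for one raw tweet."""
    words = text.split()  # whitespace tokens, including hashtags, mentions, URLs
    cleaned = [w.lower().strip(string.punctuation) for w in words]
    return {
        "word_count": len(words),
        "text_length": len(text),
        "unique_word_count": len(set(w.lower() for w in words)),
        "stop_word_count": sum(1 for w in cleaned if w in STOP_WORDS),
        "average_word_length": sum(len(w) for w in words) / len(words) if words else 0.0,
        "punctuation_count": sum(1 for ch in text if ch in string.punctuation),
        "url_link_count": len(re.findall(r"https?://\S+", text)),
        "hashtag_count": len(re.findall(r"#\w+", text)),
        "mention_count": len(re.findall(r"@\w+", text)),
    }


tweet = "Stay at home!!! #COVID19 #StaySafe @WHO https://t.co/x"
features = surface_features(tweet)
print(features)
```
]]></preformat>
      <p>Unlike the cleaned text passed to the language model, these features are computed on the raw tweet, because stop words, links, and mentions are exactly what several of them count.</p>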
    </sec>
    <sec id="sec-6">
      <title>6. Results and Discussion</title>
      <p>Experiments were conducted to evaluate the effectiveness of the proposed hybrid neural
network architecture by comparing it with several baseline models. The results are presented in
tables with summarized comparative metrics, including Accuracy, F1-score, Precision, and Recall
(in Macro Average format) [28], and a detailed classification report for the model that
demonstrated the highest efficiency.</p>
      <p>The results of the comparative analysis of neural networks presented in Table 1 demonstrate a
significant improvement in the performance of the proposed hybrid architecture compared to the
baseline models. Traditional LSTM and CNN models with classical tokenization showed average
Precision and F1-score values in the range of 0.72-0.75, indicating a moderate classification
accuracy level. The BERT model in the standalone version significantly exceeded these results,
reaching a Precision of 0.83 and an F1-score of 0.80, due to its ability to take into account the
contextual representation of the text.</p>
      <p>Integrating BERT embeddings into the traditional LSTM and CNN models also improved the results. Still,
the proposed approach, which combines BERT contextual representations with multidimensional
user features in a hybrid architecture, achieved the highest values of all metrics: Precision - 0.94,
F1-score - 0.91, Precision (Macro Avg) - 0.90, and Recall (Macro Avg) - 0.91. This indicates a
significant increase in the accuracy, balance, and completeness of the classification of the
emotional state of messages compared to other models, confirming the effectiveness of the proposed
methodology.</p>
      <p>A detailed classification report for the BERT+MLP model with five-class categorization is shown
in Table 2.</p>
      <p>The Precision, Recall, and F1-score metrics for most classes demonstrate values above 0.70,
which indicates that the model is balanced and stable. In particular, the “Extremely Negative” and
“Neutral” classes are characterized by the highest scores, with a Precision of 0.96 and 0.91,
respectively, indicating accurate recognition of strongly negative and neutral messages. Although
the “Negative”, “Positive”, and “Extremely Positive” classes show slightly lower precision or
recall, the overall classification accuracy is 90%, and the macro-averaged F1-score
reaches 0.91. This demonstrates that the model balances effectively between all classes despite their
uneven representation in the training set. The weighted average also confirms the stability and reliability of
the model when working with real data, which is essential for practical applications in analyzing
emotional states of social content.</p>
      <p>Confusion matrix analysis is a key step in evaluating the performance of multiclass emotion
classification models, as it allows for a detailed examination of classification accuracy for each class
and helps identify specific types of errors, including false positives and false negatives (Fig. 4).</p>
      <fig id="fig-4">
        <label>Figure 4</label>
        <caption>
          <p>Confusion matrix analysis.</p>
        </caption>
      </fig>
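      <p>A confusion matrix of this kind, together with a ranking of the most frequent misclassifications, can be built as in the following sketch. The five class names follow the text; the toy labels and the helper names <monospace>confusion_matrix</monospace> and <monospace>top_confusions</monospace> are hypothetical and serve only to show the mechanics.</p>
      <preformat preformat-type="code"><![CDATA[
```python
LABELS = ["Extremely Negative", "Negative", "Neutral", "Positive", "Extremely Positive"]


def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def top_confusions(matrix, labels, k=3):
    """Return the k largest off-diagonal cells as (count, true, predicted)."""
    n = len(labels)
    cells = [
        (matrix[i][j], labels[i], labels[j])
        for i in range(n)
        for j in range(n)
        if i != j and matrix[i][j] > 0
    ]
    return sorted(cells, reverse=True)[:k]


# Toy labels for illustration only.
y_true = ["Negative", "Negative", "Negative", "Neutral", "Positive", "Extremely Positive"]
y_pred = ["Negative", "Extremely Negative", "Extremely Negative", "Neutral", "Positive", "Positive"]
cm = confusion_matrix(y_true, y_pred, LABELS)
worst = top_confusions(cm, LABELS)
print(cm)
print(worst)
```
]]></preformat>
      <p>The diagonal holds the correct predictions per class, so the per-class counts quoted in the discussion correspond to diagonal cells, while confusion between neighboring intensity levels shows up in cells adjacent to the diagonal.</p>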
      <p>The traditional LSTM model using standard tokenization shows a limited ability to recognize
emotional classes accurately. The largest numbers of correct predictions are observed for the classes
“Negative” (100 instances), “Extremely Negative” (58 instances), and “Positive” (53 instances), while
the “Neutral” (50 instances) and especially “Extremely Positive” (23 instances) classes are recognized
significantly less accurately. A considerable amount of confusion is observed between neighboring
categories. For example, “Negative” is often misclassified as either “Extremely Negative” or
“Neutral.” Similarly, the “Extremely Positive” class is frequently confused with both “Positive” and
“Negative,” indicating the model’s insufficient ability to distinguish between levels of emotional
intensity.</p>
      <p>The CNN model with traditional tokenization demonstrates performance similar to
LSTM, with slight improvements in the classification of extreme classes (“Extremely Negative”: 53
instances, “Extremely Positive”: 55 instances). However, classification errors are still concentrated
around confusion between “Negative” and both “Extremely Negative” and “Positive,” as well as
difficulties in accurately identifying “Neutral” and “Positive,” which limits the model’s ability to
delineate the emotional spectrum.</p>
      <p>The BERT model, operating with three aggregated classes (“Negative,” “Neutral,” and
“Positive”), demonstrates a significant performance improvement compared to traditional
approaches, particularly for the “Negative” class (232 correct predictions). However, even under
these conditions, classification errors, especially the confusion between “Neutral” and either
“Negative” or “Positive,” as well as misclassifications between “Positive” and “Negative,” indicate
the model’s difficulty in clearly distinguishing between closely related emotional categories.</p>
      <p>Integrating BERT embeddings as input features for the LSTM model results in a moderate
improvement over the traditional LSTM. There is an increase in the number of correct predictions
for the “Negative” (97 instances) and “Extremely Negative” (61 instances) classes. However, the
“Neutral,” “Positive,” and “Extremely Positive” classes still exhibit significant confusion, suggesting
that simply feeding in embeddings without more sophisticated integration mechanisms is
insufficient.</p>
      <p>Similar trends are observed for the CNN using BERT embeddings. Moderate
improvements are achieved for “Negative” (128 instances), “Neutral” (63 instances), and “Extremely
Positive” (57 instances). However, “Positive” and “Neutral” continue to be confused with each
other and with “Negative,” and the overall number of classification errors remains noticeable.</p>
      <p>Finally, the confusion matrix of the proposed hybrid architecture (BERT + MLP) demonstrates substantially
higher accuracy and a markedly reduced number of misclassifications across all five emotional
categories. It is characterized by a high number of correct predictions in each class, including
“Extremely Negative” (106 instances), “Negative” (185), “Neutral” (114), “Positive” (133), and
“Extremely Positive” (105), along with a minimal level of confusion. This indicates the high
effectiveness of integrating contextual BERT embeddings with enriched features via a multilayer
perceptron, enabling more accurate and nuanced differentiation of emotional states.</p>
      <p>Thus, the confusion matrix analysis confirms that traditional LSTM and CNN architectures,
especially without contextual representations, have significant limitations in multiclass emotion
classification from textual data. While the standalone BERT model enhances overall performance,
achieving high precision in distinguishing emotional intensity and spectrum requires the
combination of contextual embeddings with additional features in a hybrid neural network
architecture.</p>
      <p>A correlation analysis using Pearson's coefficient was conducted to better understand the
relationships between the extracted features and the target sentiment variable. The results are
displayed as a correlation matrix (Figure 5), which shows the degree of linear dependence
between pairs of variables. The values of Pearson’s coefficient range from -1 to +1, where +1
indicates a perfect positive correlation, -1 indicates a perfect negative correlation, and 0 indicates no
linear dependence. The visualization uses a color scale, where red shades correspond to positive
and blue to negative correlations. The analysis of the correlation matrix showed a high
interconnectedness of the features characterizing the length of the text: there is a strong positive
correlation between the number of words (word_count), the number of unique words
(unique_word_count), the number of stop words (stop_word_count), and the length of the text
(text_length), with coefficients ranging from 0.65 to 0.98. This indicates that these features largely
reflect a common aspect: the amount of textual content. The average word length
(average_word_length) showed a moderate negative correlation with the features related to the
number of words and text length (from -0.41 to -0.60), which can be interpreted as a tendency to
use shorter words in longer messages, a phenomenon typical of the informal style of
communication in social networks.</p>
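      <p>For reference, Pearson's coefficient and a correlation matrix over feature columns can be computed as sketched below. The column values are invented toy numbers chosen only to show a strongly positive and a negative relationship; they are not taken from the dataset.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (nan if one is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0.0 or spread_y == 0.0:
        return float("nan")  # the coefficient is undefined for constant input
    return cov / (spread_x * spread_y)


def correlation_matrix(columns):
    """Pairwise Pearson coefficients for a dict mapping names to value lists."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Toy feature columns for illustration only.
columns = {
    "word_count": [5, 10, 15, 20, 25],
    "text_length": [32, 61, 88, 120, 149],
    "average_word_length": [5.4, 5.1, 4.9, 5.0, 4.8],
}
corr = correlation_matrix(columns)
print(corr["word_count"]["text_length"])
print(corr["word_count"]["average_word_length"])
```
]]></preformat>
      <p>A coefficient near zero only rules out a linear relationship; a feature can still be informative to a nonlinear classifier such as an MLP.</p>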
      <p>Regarding the features related to format and metadata, it was found that the number of
punctuation marks (punctuation_count) positively correlates with text length (0.47), which is
logical since longer texts tend to include more punctuation. In contrast, the number of URL links
(url_link_count) shows a weak negative correlation with text length and word count
(approximately -0.25), suggesting that messages containing external links are typically more
concise. The number of hashtags (hashtag_count) exhibits a weak negative correlation with text
length (-0.21) and a moderate positive correlation with punctuation count (0.50), possibly indicating
stylistic patterns in tweets with a high number of hashtags. The number of mentions
(mention_count) shows almost no correlation with other features (coefficients between 0.04 and
0.18), highlighting its relative independence from linguistic characteristics.</p>
      <p>The most important finding concerns the correlation between all the aforementioned features
and the target variable, sentiment (Sentiment_Encoded). The analysis showed that none of the
surface-level linguistic or structural features exhibit a significant linear relationship with
sentiment: the correlation coefficients range from -0.02 to 0.03, effectively zero. These results
demonstrate the insufficiency of relying solely on traditional surface-level features for accurate and
semantically rich emotional tone analysis in social content. They further underscore the necessity
of employing more complex models capable of capturing nonlinear dependencies and leveraging
deep contextual text representations, such as those generated by the BERT model. The integration
of such representations with extended features, as proposed in our hybrid neural network
architecture, is reasonable and well-justified. Thus, the correlation analysis highlights the
importance of a comprehensive, multidimensional approach to overcoming the limitations of
traditional sentiment analysis methods.</p>
    </sec>
    <sec id="sec-7">
      <title>7. Conclusions</title>
      <p>This study developed and experimentally validated a hybrid neural network architecture for
semantic-contextual analysis of emotions in social content. The proposed model combines
contextual vector representations of text generated by the BERT transformer model with
multidimensional extended user features and message metadata. To integrate these different types
of input data, a multilayer perceptron network is used to match linguistic and structured
information effectively. The results of the experimental comparison with traditional LSTM and
CNN architectures and the separate use of BERT or BERT embedding demonstrated the superiority
of the proposed approach in all key metrics. The hybrid BERT + MLP model achieved the highest
results: F1-measure (macro average) was 0.91, precision was 0.94, recall was 0.90, and overall
classification accuracy was 0.90. High performance was also recorded for individual emotional
classes: F1-measure for “Extremely Negative” was 0.93, for “Neutral” 0.91, and for “Negative” 0.89.</p>
      <p>Additional correlation analysis revealed that superficial linguistic and structural features, such
as the number of punctuation marks, hashtags, mentions, or URL links, do not have a significant
linear relationship with the message's sentiment. However, including these features in the model
allowed us to identify complex non-linear relationships, increasing classification accuracy.</p>
      <p>Thus, the study's results confirm the feasibility and effectiveness of the proposed hybrid
architecture for emotional analysis of social content. Combining contextual language
representations based on BERT with extended structural characteristics yields high accuracy despite
the information heterogeneity and stylistic variability of texts in social networks.</p>
    </sec>
    <sec id="sec-8">
      <title>Acknowledgements</title>
      <p>The study was conducted as part of the international educational project “Safe Artificial Intelligence:
The European Legal Dimension” [101176092, a joint project of Dnipro University of Technology,
Erasmus+ Jean Monnet Foundation, and the European Education and Culture Executive Agency
(EACEA)]. Support from the European Commission for the publication of this work does not imply
endorsement of its content, which solely reflects the views and opinions of the authors, and the
Commission cannot be held responsible for any use that may be made of the information contained
therein.</p>
    </sec>
    <sec id="sec-9">
      <title>Declaration on Generative AI</title>
      <p>The authors used Grammarly to check the grammar.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>S. A.</given-names>
            <surname>Salloum</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Al-Emran</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. A.</given-names>
            <surname>Monem</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Shaalan</surname>
          </string-name>
          ,
          <article-title>Using Text Mining Techniques for Extracting Information from Research Articles</article-title>
          , in:
          <source>Intelligent Natural Language Processing: Trends and Applications</source>
          , Springer International Publishing, Cham,
          <year>2017</year>
          , pp.
          <fpage>373</fpage>
          -
          <lpage>397</lpage>
          . doi:
          <pub-id pub-id-type="doi">10.1007/978-3-319-67056-0_18</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Drus</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Khalid</surname>
          </string-name>
          ,
          <article-title>Sentiment Analysis in Social Media and Its Application: Systematic Literature Review</article-title>
          ,
          <source>Procedia Comput. Sci.</source>
          <volume>161</volume>
          (
          <year>2019</year>
          )
          <fpage>707</fpage>
          -
          <lpage>714</lpage>
          . doi:
          <pub-id pub-id-type="doi">10.1016/j.procs.2019.11.174</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>S.</given-names>
            <surname>Voloshyn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Vysotska</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Markiv</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Dyyak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Budz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Schuchmann</surname>
          </string-name>
          ,
          <article-title>Sentiment Analysis Technology of English Newspapers Quotes Based on Neural Network as Public Opinion Influences Identification Tool</article-title>
          , in:
          <source>2022 IEEE 17th International Conference on Computer Sciences and Information Technologies (CSIT)</source>
          , IEEE,
          <year>2022</year>
          . doi:
          <pub-id pub-id-type="doi">10.1109/csit56902.2022.10000627</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>K.</given-names>
            <surname>Sunil</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Beniwal</surname>
          </string-name>
          ,
          <article-title>Sentiment Analysis: A Tool for Mining Opinions and Emotions</article-title>
          ,
          <source>SSRN Electron. J.</source>
          (
          <year>2020</year>
          ). doi:
          <pub-id pub-id-type="doi">10.2139/ssrn.3746951</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>A.</given-names>
            <surname>Alqahtani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. B.</given-names>
            <surname>Khan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Alqahtani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>AlYami</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Alfayez</surname>
          </string-name>
          ,
          <article-title>Sentiment Analysis of Semantically Interoperable Social Media Platforms Using Computational Intelligence Techniques</article-title>
          ,
          <source>Appl. Sci.</source>
          <volume>13</volume>
          .
          <issue>13</issue>
          (
          <year>2023</year>
          )
          <elocation-id>7599</elocation-id>
          . doi:
          <pub-id pub-id-type="doi">10.3390/app13137599</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>G.</given-names>
            <surname>Paltoglou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Giachanou</surname>
          </string-name>
          ,
          <article-title>Opinion Retrieval: Searching for Opinions in Social Media</article-title>
          , in:
          <source>Professional Search in the Modern World</source>
          , Springer International Publishing, Cham,
          <year>2014</year>
          , pp.
          <fpage>193</fpage>
          -
          <lpage>214</lpage>
          . doi:
          <pub-id pub-id-type="doi">10.1007/978-3-319-12511-4_10</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>J.</given-names>
            <surname>Park</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. J.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Cho</surname>
          </string-name>
          ,
          <article-title>Automatic construction of direction-aware sentiment lexicon using direction-dependent words</article-title>
          ,
          <source>Lang. Resour. Eval.</source>
          (
          <year>2024</year>
          ). doi:
          <pub-id pub-id-type="doi">10.1007/s10579-024-09737-9</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>S.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <article-title>Automatic construction of target-specific sentiment lexicon</article-title>
          ,
          <source>Expert Syst. With Appl.</source>
          <volume>116</volume>
          (
          <year>2019</year>
          )
          <fpage>285</fpage>
          -
          <lpage>298</lpage>
          . doi:
          <pub-id pub-id-type="doi">10.1016/j.eswa.2018.09.024</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>G.</given-names>
            <surname>Badaro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Jundi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Hajj</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>El-Hajj</surname>
          </string-name>
          ,
          <article-title>EmoWordNet: Automatic Expansion of Emotion Lexicon Using English WordNet</article-title>
          , in:
          <source>Proceedings of the Seventh Joint Conference on Lexical and Computational Semantics</source>
          , Association for Computational Linguistics, Stroudsburg, PA, USA,
          <year>2018</year>
          . doi:
          <pub-id pub-id-type="doi">10.18653/v1/s18-2009</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>R.</given-names>
            <surname>Menaha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Ananthi</surname>
          </string-name>
          ,
          <article-title>Reviewing the effectiveness of lexicon-based techniques for sentiment analysis in massive open online courses</article-title>
          ,
          <source>Int. J. Data Sci. Anal.</source>
          (
          <year>2024</year>
          ). doi:
          <pub-id pub-id-type="doi">10.1007/s41060-024-00585-y</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>J.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Becken</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Stantic</surname>
          </string-name>
          ,
          <article-title>Lexicon based Chinese language sentiment analysis method</article-title>
          ,
          <source>Comput. Sci. Inf. Syst.</source>
          <volume>16</volume>
          .
          <issue>2</issue>
          (
          <year>2019</year>
          )
          <fpage>639</fpage>
          -
          <lpage>655</lpage>
          . doi:
          <pub-id pub-id-type="doi">10.2298/csis181015013c</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>E.-U.</given-names>
            <surname>Rahman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Sarma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Sinha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Sinha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Pradhan</surname>
          </string-name>
          ,
          <article-title>A Survey on Twitter Sentiment Analysis</article-title>
          ,
          <source>Int. J. Comput. Sci. Eng</source>
          .
          <volume>6</volume>
          .
          <issue>11</issue>
          (
          <year>2018</year>
          )
          <fpage>644</fpage>
          -
          <lpage>648</lpage>
          . doi:
          <pub-id pub-id-type="doi">10.26438/ijcse/v6i11.644648</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>V.J.</given-names>
            <surname>Kashtan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.V.</given-names>
            <surname>Hnatushenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.I.</given-names>
            <surname>Shedlovska</surname>
          </string-name>
          ,
          <article-title>Processing technology of multispectral remote sensing images</article-title>
          , in:
          <source>2017 IEEE International Young Scientists Forum on Applied Physics and Engineering</source>
          (YSF), Lviv, Ukraine,
          <year>2017</year>
          , pp.
          <fpage>355</fpage>
          -
          <lpage>358</lpage>
          . doi:
          <pub-id pub-id-type="doi">10.1109/YSF.2017.8126647</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>D.</given-names>
            <surname>Holubinka</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Vysotska</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Vladov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Ushenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Talakh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Tomka</surname>
          </string-name>
          ,
          <article-title>Intelligent System for Recognizing Tone and Categorizing Text in Media News at an Electronic Business Based on Sentiment and Sarcasm Analysis</article-title>
          ,
          <source>Int. J. Inf. Eng. Electron. Bus.</source>
          <volume>17</volume>
          .
          <issue>1</issue>
          (
          <year>2025</year>
          )
          <fpage>90</fpage>
          -
          <lpage>139</lpage>
          . doi:
          <pub-id pub-id-type="doi">10.5815/ijieeb.2025.01.06</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>V.</given-names>
            <surname>Kashtan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Hnatushenko</surname>
          </string-name>
          ,
          <article-title>Deep Learning Technology for Automatic Burned Area Extraction Using Satellite High Spatial Resolution Images</article-title>
          , in:
          <source>Lecture Notes in Data Engineering, Computational Intelligence, and Decision Making</source>
          , Springer International Publishing, Cham,
          <year>2022</year>
          , pp.
          <fpage>664</fpage>
          -
          <lpage>685</lpage>
          . doi:
          <pub-id pub-id-type="doi">10.1007/978-3-031-16203-9_37</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>V. Y.</given-names>
            <surname>Kashtan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V. V.</given-names>
            <surname>Hnatushenko</surname>
          </string-name>
          ,
          <article-title>Automated building damage detection on digital imagery using machine learning</article-title>
          ,
          <source>Nauk. Visnyk Natsionalnoho Hirnychoho Universytetu</source>
          No.
          <issue>6</issue>
          (
          <year>2023</year>
          )
          <fpage>134</fpage>
          -
          <lpage>140</lpage>
          . doi:
          <pub-id pub-id-type="doi">10.33271/nvngu/2023-6/134</pub-id>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>