<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <contrib-group>
        <aff id="aff0">
          <label>0</label>
          <institution>Lviv Polytechnic National University</institution>
          ,
          <addr-line>Bandera str.12, Lviv, 79013</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Sentiment analysis is a fundamental component of natural language processing (NLP), enabling the automated assessment of textual sentiment across different languages. However, widely used sentiment analysis tools, such as VADER, often struggle with language-specific challenges, particularly in morphologically rich and syntactically complex languages like Ukrainian. This study introduces an improved rule-based sentiment analysis algorithm specifically designed for Ukrainian-language texts, addressing the limitations of generic approaches. The proposed algorithm integrates an enhanced lexicon, including the EMOLEX sentiment dictionary, polarity scores, emoji sentiment mapping, and intensity boosters, to refine sentiment classification. Additionally, dependency parsing and position-aware scoring mechanisms are employed to improve contextual understanding, enabling more accurate differentiation between positive, negative, and neutral sentiments. These enhancements are particularly important for capturing Ukrainian-specific linguistic structures, which pose difficulties for existing sentiment analysis models. The algorithm's effectiveness was evaluated using Ukrainian-language datasets, comparing its performance against the widely used VADER sentiment analysis tool. The results demonstrate that the custom algorithm significantly outperforms VADER in detecting sentiment polarity, particularly in cases with strong positive or negative sentiment. This confirms the necessity of language-specific sentiment analysis tools for non-English content, as they provide greater accuracy and contextual sensitivity. Despite the promising results, further improvements remain possible. One key area for future research involves integrating artificial intelligence (AI) techniques, such as machine learning and deep learning, to create a hybrid framework that enhances the accuracy of sentiment classification, especially for ambiguous or nuanced expressions.</p>
        <p>Keywords: sentiment analysis, Ukrainian language, rule-based algorithm, natural language processing, dependency parsing, lexicon expansion, sentiment classification</p>
        <p>CLW-2025: Computational Linguistics Workshop at 9th International Conference on Computational Linguistics and Intelligent Systems (CoLInS-2025), May 15-16, 2025, Kharkiv, Ukraine. ∗ Corresponding author. † These authors contributed equally.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Sentiment analysis is a core task of natural language processing (NLP) that extracts
opinions, emotions, and attitudes from text [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. This is important for market
research and social media monitoring, and it is also useful in analyzing customer feedback [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>
        Ukrainian is spoken by millions of people worldwide and is increasingly prominent in digital
communication, including social media, news sites, and customer reviews. However, its rich
morphology, flexible syntax, frequent negation, and idiomatic expressions make it difficult for
existing sentiment analysis algorithms to handle [
        <xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>
        ]. Most popular sentiment analysis tools, such
as VADER, are designed for English and do not cover the grammatical structures and
lexical properties of Ukrainian, which lowers their accuracy when they are applied to
Ukrainian texts [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p>
        Sentiment analysis has evolved from traditional rule-based approaches to modern deep learning
techniques, enabling more accurate and context-aware classification of emotions in text [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. While
early methods, such as lexicon-based sentiment scoring, were effective for structured languages, they
often fail to capture complex syntactic dependencies in morphologically rich languages like
Ukrainian. Recent advancements in natural language processing (NLP) have introduced powerful
transformer-based models, such as Ukr-RoBERTa and Multilingual BERT (mBERT), which
significantly improve sentiment classification by leveraging deep contextual embeddings [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. These
models are trained on vast multilingual datasets and can generalize well across different languages,
including Ukrainian. However, challenges remain, particularly in handling domain-specific
sentiment expressions, idiomatic phrases, and negation structures. The following analysis examines
key research contributions and existing sentiment analysis tools, emphasizing their methodologies,
limitations, and applicability to Ukrainian-language content [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
      </p>
      <p>
        Despite these advancements, sentiment analysis for the Ukrainian language remains
underdeveloped, with many widely used tools being optimized primarily for English. Existing
approaches struggle to fully adapt to the rich morphology, flexible word order, and unique sentiment
expressions present in Ukrainian texts [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. This highlights the need for more specialized research
and the development of tailored sentiment analysis models that can accurately interpret sentiment
in Ukrainian-language content. Addressing these challenges will not only improve sentiment
classification accuracy but also enhance the applicability of NLP techniques for Ukrainian in fields
such as social media monitoring, customer feedback analysis, and political discourse evaluation [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ].
      </p>
      <p>
        The current research aims to develop an enhanced rule-based sentiment analysis algorithm
specifically tailored for Ukrainian-language content [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. To address the challenges highlighted in
previous studies and overcome the limitations of existing sentiment analysis tools, this work focuses
on the following key tasks.
      </p>
      <p>Expansion of Ukrainian sentiment lexicons. Integrating a comprehensive sentiment lexicon that
accounts for Ukrainian-specific linguistic features, including domain-specific vocabulary, slang, and
idiomatic expressions. Incorporating emoji sentiment mappings to improve the classification of
informal and digital communication, which is essential for social media and online content analysis.</p>
      <p>
        Integration of dependency-based syntax parsing. Utilizing syntactic dependency parsing to capture
contextual relationships between words, allowing for more accurate sentiment classification in a
morphologically rich and flexible word-order language like Ukrainian [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ].
      </p>
      <p>Refinement of sentiment intensity modifiers. Developing an improved approach to handling
intensity modifiers (e.g., “дуже” – very, “майже” – almost, “надзвичайно” – extremely) to ensure
correct sentiment scaling. Enhancing positional effect handling, ensuring that the placement of
sentiment-bearing words in a sentence (e.g., at the beginning or end) influences classification
outcomes appropriately.</p>
      <p>
        Comparison with existing sentiment analysis models. Conducting a quantitative and qualitative
comparison of the proposed rule-based model with VADER, a widely used sentiment analysis tool
optimized for English [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. Benchmarking against pre-existing methodologies to demonstrate the
strengths and weaknesses of a rule-based approach compared to statistical and deep learning
methods for sentiment analysis in non-English languages.
      </p>
      <p>
        By fulfilling these research objectives, this study aims to enhance sentiment classification
accuracy for Ukrainian-language content, contributing to the development of specialized NLP tools
for underrepresented languages [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. The proposed improvements provide a foundation for hybrid
sentiment analysis models that can combine rule-based and machine-learning approaches in future
studies.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Works</title>
      <p>
        Recent studies have explored hybrid techniques that combine rule-based processing with machine
learning to address linguistic nuances and improve classification performance. The following
analysis reviews key research contributions, highlighting their methodologies, strengths, and
limitations in advancing sentiment analysis.
      </p>
      <p>
        In "Mining and Summarizing Customer Reviews" [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] M. Hu and B. Liu introduce a novel
approach for extracting sentiment information from customer reviews. Their method focuses
on summarizing reviews by identifying product features and categorizing opinions as
positive or negative. The study presents an unsupervised learning model that extracts
sentiment-oriented phrases and organizes them into structured summaries. This work laid
the foundation for many modern sentiment analysis systems by emphasizing feature-based
sentiment extraction.
      </p>
      <p>
        In "A Joint Model of Text and Aspect Ratings for Sentiment Summarization" [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] I. Titov, R.
McDonald propose a probabilistic model for sentiment summarization, integrating textual
reviews with numerical aspect ratings. Their joint modeling approach allows the system to
generate more accurate and aspect-specific sentiment summaries. The paper demonstrates
how this method improves the interpretability of sentiment classification, making it
particularly useful for product review analysis.
      </p>
      <p>
        In the work "A Real-time Hand Gesture Recognition System for Human-Computer and
Human-Robot Interaction" [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ], the authors propose a gesture recognition system designed to
improve human-computer and human-robot interaction. According to the authors,
gesture-based interaction enables natural and intuitive communication between people
and technology.
      </p>
      <p>
        Authors of "Determining the Sentiment of Opinions" [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] explore the challenges of
determining the sentiment of user opinions by developing a method that distinguishes
between subjectivity and sentiment polarity. Their approach combines machine learning
techniques with rule-based linguistic analysis to improve classification accuracy. A key
contribution of this work is its focus on contextual sentiment detection, which enhances its
applicability to opinion mining.
      </p>
      <p>
        In "Thumbs Up or Thumbs Down? Semantic Orientation Applied to Unsupervised
Classification of Reviews" [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] P. D. Turney introduces an unsupervised method for
sentiment classification using semantic orientation. The approach leverages pointwise
mutual information (PMI) to measure the association between words and their sentiment
polarity. This study is notable for its effectiveness in review classification without requiring
labeled training data, making it a significant milestone in early sentiment analysis research.
      </p>
      <p>
        In "Thumbs Up? Sentiment Classification Using Machine Learning Techniques" [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ] B. Pang, L. Lee, and S. Vaithyanathan present one of the first applications of machine learning for
sentiment classification. The study compares Naïve Bayes, maximum entropy, and support
vector machines (SVM) for sentiment polarity classification on movie reviews. Their findings
demonstrate that SVM outperforms other classifiers, establishing it as a dominant technique
in early sentiment analysis research.
      </p>
      <p>
        Authors of "Learning Extraction Patterns for Subjective Expressions" [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ] propose an
approach to extract subjective expressions from text. Their method relies on pattern-based
learning techniques to identify phrases expressing sentiment. This work is critical in
advancing fine-grained sentiment analysis, particularly for detecting implicit opinions that
may not contain explicit sentiment words.
      </p>
      <p>
        In "Peculiarities of an Information System Development for Studying Ukrainian Language
and Carrying out an Emotional and Content Analysis" [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ] authors present a study on the
development of an information system designed for the analysis of Ukrainian-language
content. Their research focuses on integrating emotional and content-based sentiment
analysis techniques, addressing the unique linguistic challenges posed by the Ukrainian
language.
      </p>
      <p>
        Building on the challenges identified in previous research, this study continues the development
of sentiment analysis tools specifically designed for Ukrainian-language content. Existing sentiment
analysis models, including multilingual transformer-based approaches, still struggle with the
morphological complexity, rich syntax, and unique contextual dependencies of Ukrainian [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ]. To
address these issues, this research proposes an enhanced rule-based sentiment analysis algorithm
that leverages expanded lexicons, dependency parsing, and refined rule-based logic to achieve more
precise sentiment classification [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ]. This approach addresses linguistic and contextual problems that
are specific to Ukrainian but rarely handled by current frameworks. Rule-based systems also offer
interpretable and transparent decision-making processes compared with black-box methods [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ].
      </p>
      <p>
        This research is based on the preliminary work presented in "Naive Rule-Based Method in
Sentiment Analysis of Ukrainian Language Content," where a simple rule-based algorithm was
introduced to conduct sentiment analysis on Ukrainian text [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ]. The baseline method used
predefined positive and negative lexicons along with a few grammatical rules for handling negation
and modifiers [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ]. The study underlined several critical limitations:
      </p>
      <p>Contextual Insensitivity. The naive method ignored grammatical and syntactic
relationships between words, which reduced accuracy on texts with complex sentence structures.</p>
      <p>Negation Handling. Basic negation rules were implemented, but they did not capture
subtle interactions between negations and word intensities.</p>
      <p>
        Lexicon Coverage. The small lexicon used resulted in low recall for texts containing slang,
idiomatic expressions, or domain-specific terms [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ].
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Methods and Materials</title>
      <p>To present the main aspects of the subject area, a scheme was prepared that
reflects the main stages to be implemented in the sentiment analysis system (Fig. 1).</p>
      <p>As shown in Fig. 1, the analysis process consists of three main parts:
• Preprocessing – removing extraneous information from the data to be analyzed.
• Sentiment analysis – the main process of evaluating each token.
• Final sentiment calculation – aggregating the results of the sentiment analysis and
compounding the scores into one result.</p>
      <sec id="sec-3-1">
        <title>3.1. Description of the New Algorithm: Lexicon</title>
        <p>
          The proposed rule-based sentiment analysis algorithm uses a rich lexicon
tuned for Ukrainian-language content. Its resources account for the linguistic and emotional
subtleties of sentiment classification [
          <xref ref-type="bibr" rid="ref28">28</xref>
          ].
        </p>
        <p>
          The core resource is an extended Ukrainian version of the EMOLEX lexicon, in which
Ukrainian words carry sentiment labels (positive, negative, and neutral) together with
emotion categories such as joy, anger, trust, and fear. These labels allow the algorithm to
identify emotional intensity more accurately [
          <xref ref-type="bibr" rid="ref29 ref30">29, 30</xref>
          ].
        </p>
        <p>Additionally, a supplementary polarity lexicon extends EMOLEX with sentiment scores for less
common and domain-specific terms. Words are scored on their polarity, from highly negative to
highly positive, allowing a finer granularity and increased coverage for the algorithm [31].</p>
        <p>The inclusion of an expanded emoji sentiment mapping allows the algorithm to process informal
communication, such as social media texts. Sentiment scores are assigned to commonly used emojis,
categorizing them as positive, negative, or neutral.
This enhances the algorithm’s ability to classify modern digital texts accurately.</p>
        <p>The algorithm relies on a sophisticated set of intensity booster words that raise or lower the
emotional impact of surrounding words. For example, words like “дуже” (“very”) or “абсолютно”
(“absolutely”) increase the intensity of positive or negative sentiment, while modifiers like “трохи”
(“slightly”) reduce it [32].</p>
        <p>A large phrase sentiment lexicon was used to score multi-word expressions and idiomatic phrases.
This resource ensures that the algorithm can capture the sentiment of complex phrases, such as “на
межі розпачу” (“on the verge of despair”), which would otherwise be lost in word-by-word analysis.</p>
        <p>The algorithm also filters out stopwords, that is, frequent words that do not contribute to
sentiment, such as “і” (“and”) or “або” (“or”). It uses a hand-curated list of Ukrainian stopwords so
that only meaningful words are considered, thereby improving both accuracy and
efficiency.</p>
        <p>This combination of resources provides a strong foundation for the algorithm in handling the
variety of vocabulary, expressions, and informal modes of communication present in content written
in the Ukrainian language.</p>
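        <p>As a rough illustration of how these resources might be organized, the following Python sketch stores each lexicon as a dictionary and looks tokens up in it. All entries and scores are hypothetical placeholders chosen for illustration (only the 1.5 weight for “дуже” and the 0.8 weight for “трохи” echo example values given in Section 3.3), and the function names <monospace>lookup</monospace> and <monospace>lookup_phrase</monospace> are assumptions rather than part of the published algorithm.</p>
        <preformat>
```python
from typing import List, Optional

# Hypothetical, hand-made entries for illustration only. The real resources
# (EMOLEX-based lexicon, polarity lexicon, emoji map, boosters, phrases,
# stopwords) are much larger, and their actual scores are not given here.
WORD_POLARITY = {"гарно": 1.0, "чудово": 1.0, "сумно": -1.0}
EMOJI_POLARITY = {"🙂": 0.5, "😢": -0.5}
BOOSTERS = {"дуже": 1.5, "абсолютно": 1.5, "трохи": 0.8}
PHRASE_POLARITY = {("на", "межі", "розпачу"): -1.0}
STOPWORDS = {"і", "або"}


def lookup(token: str) -> float:
    """Return the polarity of a single token, or 0.0 if it is unknown."""
    key = token.lower()
    if key in STOPWORDS:
        return 0.0  # stopwords never contribute to sentiment
    if key in EMOJI_POLARITY:
        return EMOJI_POLARITY[key]
    return WORD_POLARITY.get(key, 0.0)


def lookup_phrase(tokens: List[str]) -> Optional[float]:
    """Return the polarity of a multi-word expression, or None if unlisted."""
    return PHRASE_POLARITY.get(tuple(t.lower() for t in tokens))
```
        </preformat>
        <p>Keeping phrases in a separate table keyed by token tuples lets an idiom such as “на межі розпачу” be scored as a unit instead of word by word.</p>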
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Description of the New Algorithm: Text Preprocessing</title>
        <p>Accurate text preprocessing is a key step in the proposed rule-based sentiment analysis
algorithm [33]. It ensures that input texts are transformed into a structured format
for analysis while preserving the nuances of language and context.</p>
        <p>The first step in text processing is tokenization, where the input text is divided into smaller units
called tokens. Tokens can include words, punctuation marks, and emojis. This process is designed to
handle Ukrainian-language content effectively by:
• preserving the structure of words with rich morphology;
• retaining punctuation marks such as "!" and ".", which are later analyzed for their
influence on sentiment;
• isolating and detecting emojis, since they are treated as independent sentiment-bearing
units.</p>
        <p>For example, the sentence: "Це було неймовірно красиво, але трохи сумно !" is tokenized
into: ["Це", "було", "неймовірно", "красиво", ",", "але", "трохи", "сумно", " ", "!"].</p>
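        <p>The tokenization step described above can be sketched with a single regular expression. This is a minimal illustration, not the authors' tokenizer: the pattern and the function name <monospace>tokenize</monospace> are assumptions, the emoji in the usage example is an arbitrary stand-in, and emojis built from several code points (for example, with joiners or variation selectors) would be split by this simple pattern.</p>
        <preformat>
```python
import re
from typing import List

# Alternatives, tried in order:
#   1. an ellipsis kept as one token,
#   2. a word made of letters, optionally joined by an apostrophe or hyphen,
#   3. a run of digits,
#   4. any other single non-space symbol (punctuation or a one-code-point emoji).
TOKEN_PATTERN = re.compile(
    r"\.\.\.|[^\W\d_]+(?:['’ʼ-][^\W\d_]+)*|\d+|[^\w\s]"
)


def tokenize(text: str) -> List[str]:
    """Split text into word, punctuation, and emoji tokens."""
    return TOKEN_PATTERN.findall(text)
```
        </preformat>
        <p>For instance, <monospace>tokenize("Це цікаво...")</monospace> keeps the ellipsis as a single token, so a later punctuation rule can treat it differently from a full stop.</p>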
        <p>The algorithm captures contextual sentiment and idiomatic expressions by using N-gram analysis. N-grams
are sequences of N consecutive tokens, which are important for identifying multi-word
expressions and phrases that carry sentiment. The algorithm processes unigrams (single tokens)
to analyze individual words such as "красиво" ("beautiful"); bigrams (two-word phrases), which
capture local context, for example, "трохи сумно" ("slightly sad"); and trigrams (three-word
phrases) to identify more complex expressions, such as "на межі розпачу" ("on the verge of despair")
[34]. N-grams enable the algorithm to bring in phrase-level sentiment, enhancing its ability to
deal with subtle language constructs that may not be captured using a purely word-based approach
[35].</p>
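        <p>A short sketch of N-gram extraction and phrase matching is given below. It assumes tokens are already available as a list of strings; the one-entry phrase lexicon and the names <monospace>ngrams</monospace> and <monospace>phrase_hits</monospace> are hypothetical and serve only to illustrate the idea.</p>
        <preformat>
```python
from typing import Dict, List, Tuple

Gram = Tuple[str, ...]

# Hypothetical one-entry phrase lexicon, for illustration only.
PHRASES: Dict[Gram, float] = {("на", "межі", "розпачу"): -1.0}


def ngrams(tokens: List[str], n: int) -> List[Gram]:
    """Return every run of n consecutive tokens, in sentence order."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def phrase_hits(tokens: List[str], lexicon: Dict[Gram, float],
                max_n: int = 3) -> List[Tuple[int, Gram, float]]:
    """Find lexicon phrases among trigrams, then bigrams.

    Each hit is (start index, phrase, score). Longer phrases are
    reported first so that a caller can prefer them over shorter ones.
    """
    lowered = [t.lower() for t in tokens]
    hits = []
    for n in range(max_n, 1, -1):
        for start, gram in enumerate(ngrams(lowered, n)):
            if gram in lexicon:
                hits.append((start, gram, lexicon[gram]))
    return hits
```
        </preformat>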
        <p>The integration of tokenization and N-gram analysis ensures that the algorithm captures both the
sentiment of individual words and the contextual meaning of phrases, making the
classification more accurate.</p>
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Description of the New Algorithm: Sentiment Analysis</title>
        <p>To handle the linguistic complexity of Ukrainian, the proposed sentiment analysis algorithm
includes a dependency analysis component. This module identifies grammatical
relations between words, which enables the algorithm to consider context and interactions within
a sentence [36]. Dependency analysis enhances sentiment classification by handling key linguistic
phenomena: negations, modifiers, and punctuation.</p>
        <p>Negation plays an important role in sentiment polarity. The algorithm recognizes negation words
like "не" or "ні" and adjusts the sentiment of the words associated with them.</p>
        <p>Example:</p>
        <p>Input. "Це не гарно" ("This is not beautiful").</p>
        <p>Without negation handling. Positive due to "гарно" ("beautiful").</p>
        <p>With negation handling. Negative due to "не" ("not").</p>
        <sec id="sec-3-3-1">
          <p>The algorithm uses syntactic dependencies to link negations to their target words, ensuring accurate sentiment reversal.</p>
        </sec>
        <sec id="sec-3-3-2">
          <p>Let:
• S(w) – sentiment score of a word w;
• w<sub>n</sub> – negation word (e.g., "не", "ні");
• D(w<sub>n</sub>, w<sub>t</sub>) – syntactic dependency linking w<sub>n</sub> to its target word w<sub>t</sub>;
• S′(w<sub>t</sub>) – adjusted sentiment score of the target word w<sub>t</sub>;
• α – amplification factor (e.g., α = 1.5) to increase the effect of negation.</p>
          <p>The adjusted sentiment score is calculated as (Eq. 1):</p>
          <disp-formula><label>(1)</label><tex-math>S'(w_t) = -\alpha \cdot S(w_t)</tex-math></disp-formula>
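          <p>The negation rule can be sketched as follows. The dependency links are written by hand as (dependent index, head index) pairs, standing in for the output of a dependency parser; this representation, the base score of 1.0 for "гарно", and the function name <monospace>apply_negation</monospace> are assumptions made for illustration.</p>
          <preformat>
```python
from typing import List, Tuple

NEGATIONS = {"не", "ні"}


def apply_negation(tokens: List[str], scores: List[float],
                   deps: List[Tuple[int, int]],
                   alpha: float = 1.5) -> List[float]:
    """Flip and amplify the score of every word targeted by a negation.

    deps holds (dependent index, head index) pairs. For each pair whose
    dependent is a negation word, the head's score becomes -alpha * S(head),
    always computed from the original score.
    """
    adjusted = list(scores)
    for dep, head in deps:
        if tokens[dep].lower() in NEGATIONS:
            adjusted[head] = -alpha * scores[head]
    return adjusted


# "Це не гарно": the negation "не" (index 1) attaches to "гарно" (index 2).
tokens = ["Це", "не", "гарно"]
scores = [0.0, 0.0, 1.0]  # assumed base score for "гарно"
negated = apply_negation(tokens, scores, [(1, 2)])
```
          </preformat>
          <p>With the example factor of 1.5, the assumed positive score of 1.0 becomes -1.5, so the sentence is classified as negative rather than positive.</p>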
          <p>Modifiers are intensity boosters such as "дуже" ("very") and "абсолютно" ("absolutely"), or
reducers such as "трохи" ("slightly"), which increase or reduce the emotional weight of words. Example:</p>
          <p>Input. "Це дуже гарно" ("This is very beautiful").</p>
          <p>Sentiment score for "гарно" is increased due to the booster "дуже".</p>
          <p>By leveraging dependency relationships, the algorithm ensures that modifiers are correctly
associated with their target words.</p>
          <p>Let:
• S(w) – sentiment score of a word w;
• w<sub>m</sub> – modifier word (e.g., "дуже" – "very", "абсолютно" – "absolutely", "трохи" – "slightly");
• D(w<sub>m</sub>, w<sub>t</sub>) – syntactic dependency linking w<sub>m</sub> to its target word w<sub>t</sub>;
• S′(w<sub>t</sub>) – adjusted sentiment score of the target word w<sub>t</sub>;
• β(w<sub>m</sub>) – modifier weight, where β &gt; 1 for boosters (e.g., "дуже") and 0 &lt; β &lt; 1 for reducers
(e.g., "трохи"); the weight depends on the intensifying or reducing effect of the modifier.</p>
          <p>The adjusted sentiment score is calculated as (Eq. 2):</p>
          <disp-formula><label>(2)</label><tex-math>S'(w_t) = \beta(w_m) \cdot S(w_t)</tex-math></disp-formula>
          <p>For example: β("дуже") = 1.5 (boosts sentiment), β("трохи") = 0.8 (reduces sentiment).</p>
          <p>Punctuation marks, such as exclamation points ("!") and ellipses ("..."), often convey additional
emotional context [37]. The algorithm adjusts sentiment scores based on the presence and type of
punctuation:</p>
          <p>Exclamation marks. Amplify sentiment intensity. Example: "Це чудово!" ("This is
wonderful!") has a higher sentiment score due to the exclamation mark.</p>
          <p>Ellipses. Reduce sentiment intensity, indicating hesitation or uncertainty. Example: "Це цікаво..."
("This is interesting...") has a lower sentiment score due to the ellipsis.</p>
          <p>Let:
• S(w) – sentiment score of a word w;
• p – punctuation mark associated with the sentence (e.g., "!" or "...");
• S′(w) – adjusted sentiment score of the word w;
• γ(p) – punctuation multiplier, where γ("!") &gt; 1 (amplifies sentiment intensity) and 0 &lt;
γ("...") &lt; 1 (reduces sentiment intensity).</p>
          <p>The adjusted sentiment score is calculated as (Eq. 3):</p>
          <disp-formula><label>(3)</label><tex-math>S'(w) = \gamma(p) \cdot S(w)</tex-math></disp-formula>
          <p>For example: γ("!") = 1.5 (example amplification factor for exclamation marks), γ("...") = 0.8
(example reduction factor for ellipses).</p>
          <p>Using a dependency parser, the algorithm builds a tree structure for each sentence, identifying
grammatical relationships between words. For example:</p>
          <p>Input. "Це не дуже гарно!" ("This is not very beautiful!")
Dependency tree:
• "не" → modifies → "гарно";
• "дуже" → modifies → "гарно";
• "!" → modifies → overall sentiment intensity.</p>
          <p>The algorithm processes these relationships to adjust sentiment scores dynamically, improving
accuracy in handling complex sentences. By incorporating dependency analysis, the algorithm
captures much subtler linguistic interactions that usually elude simpler, rule-based systems. This
yields much more detailed and accurate sentiment classification [38].</p>
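          <p>To make the interaction of these rules concrete, the sketch below applies negation, modifier, and punctuation adjustments to the example sentence "Це не дуже гарно!". The text does not state how the three adjustments combine, so this sketch assumes they simply multiply; the hand-written dependency pairs, the base score of 1.0 for "гарно", and the function name <monospace>adjust</monospace> are likewise assumptions. The weights are the example values quoted in this section.</p>
          <preformat>
```python
from typing import List, Tuple

NEGATIONS = {"не", "ні"}
ALPHA = 1.5                                     # example negation factor
MODIFIER_WEIGHTS = {"дуже": 1.5, "трохи": 0.8}  # example beta values
PUNCT_WEIGHTS = {"!": 1.5, "...": 0.8}          # example gamma values


def adjust(tokens: List[str], scores: List[float],
           deps: List[Tuple[int, int]]) -> List[float]:
    """Apply modifier, negation, and punctuation multipliers.

    deps holds (dependent index, head index) pairs, hand-written here in
    place of real parser output. Assumption: all factors multiply.
    """
    adjusted = list(scores)
    for dep, head in deps:
        word = tokens[dep].lower()
        if word in MODIFIER_WEIGHTS:
            adjusted[head] *= MODIFIER_WEIGHTS[word]
        elif word in NEGATIONS:
            adjusted[head] *= -ALPHA
    # Sentence-level punctuation scales every word score.
    gamma = 1.0
    for tok in tokens:
        gamma *= PUNCT_WEIGHTS.get(tok, 1.0)
    return [gamma * s for s in adjusted]


tokens = ["Це", "не", "дуже", "гарно", "!"]
scores = [0.0, 0.0, 0.0, 1.0, 0.0]   # assumed base score for "гарно"
deps = [(1, 3), (2, 3)]              # "не" and "дуже" both attach to "гарно"
result = adjust(tokens, scores, deps)
```
          </preformat>
          <p>Under the multiplicative assumption the score for "гарно" becomes 1.0 · (-1.5) · 1.5 · 1.5 = -3.375. A different combination rule, for example one that lets a negated booster soften rather than strengthen the result, would give a different value.</p>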
        </sec>
      </sec>
      <sec id="sec-3-4">
        <title>3.4. Description of the New Algorithm: Final Sentiment Calculation</title>
        <p>Sentiment scoring is further refined with multipliers and positional weights, which
adjust the intensity of emotional words and phrases and so enable finer-grained classification.
This ensures that contextually important words and sentence positions are
weighted appropriately [39].</p>
        <p>The algorithm explicitly includes boosters (words that increase or decrease the intensity of
sentiment) in the scoring mechanism and assigns predefined weights to them
based on their strength and direction. Amplifiers such as "дуже" ("very") or
"абсолютно" ("absolutely") increase the intensity of the associated sentiment. For example, in "Це дуже
гарно" ("This is very beautiful") the sentiment score for "гарно" is multiplied by a factor of 1.5
due to the booster "дуже". Reducers such as "трохи" ("slightly") or "майже" ("almost")
reduce the intensity. For instance, in "Це трохи сумно" ("This is slightly sad") the sentiment score
for "сумно" is multiplied by a factor of 0.7.</p>
        <p>The algorithm identifies these words through dependency parsing and applies the corresponding
multipliers to adjust the sentiment score dynamically.</p>
        <p>The placement of a word or phrase in a sentence can have a significant impact on the
overall sentiment. To account for this, the algorithm assigns positional weights:</p>
        <p>Beginning of the sentence. Words at the start of a sentence often set the tone and are assigned
higher weights. For instance, "Чудово, але трохи складно" ("Wonderful, but slightly
difficult"), where "Чудово" ("Wonderful") receives a higher weight, emphasizing its
influence.</p>
        <p>End of the sentence. Words at the end of a sentence often leave a lasting impression and are
given slightly higher weights than words in the middle. An example is "Це було добре, але
складно" ("It was good, but difficult"), where "складно" ("difficult") receives a higher weight
due to its sentence-ending position.</p>
        <p>Let:
• S(w) – sentiment score of a word w;
• P(w) – positional weight assigned to the word w, based on its position in the sentence;
• S′(w) – adjusted sentiment score of the word w.</p>
        <p>The adjusted sentiment score is calculated as (Eq. 4):</p>
        <disp-formula><label>(4)</label><tex-math>S'(w) = P(w) \cdot S(w)</tex-math></disp-formula>
        <p>Where P(w<sub>start</sub>) &gt; P(w<sub>end</sub>) &gt; P(w<sub>middle</sub>), that is, higher weights are assigned to the start and end
positions. For example:
• P(w<sub>start</sub>) = 1.5 (example weight for words at the start of the sentence);
• P(w<sub>end</sub>) = 1.2 (example weight for words at the end of the sentence);
• P(w<sub>middle</sub>) = 1.0 (default weight for words in the middle of the sentence).</p>
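        <p>The positional weighting of Eq. 4 can be sketched as below, using the example weights 1.5, 1.2, and 1.0. The sketch assumes that positions are counted over word tokens only (punctuation removed beforehand) and that the base scores of 1.0 and -1.0 are placeholders; the function names are hypothetical.</p>
        <preformat>
```python
from typing import List


def positional_weight(index: int, length: int, start: float = 1.5,
                      end: float = 1.2, middle: float = 1.0) -> float:
    """Return the weight for the word at the given index in the sentence."""
    if index == 0:
        return start
    if index == length - 1:
        return end
    return middle


def apply_position(scores: List[float]) -> List[float]:
    """Scale each word score by the weight of its position (Eq. 4)."""
    n = len(scores)
    return [positional_weight(i, n) * s for i, s in enumerate(scores)]


# "Чудово, але трохи складно" with punctuation already removed.
words = ["Чудово", "але", "трохи", "складно"]
base = [1.0, 0.0, 0.0, -1.0]   # assumed base scores
weighted = apply_position(base)
```
        </preformat>
        <p>Here the sentence-initial "Чудово" is scaled to 1.5 and the sentence-final "складно" to -1.2, so the opening word dominates the total.</p>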
        <p>Together, the extended lexicons, preprocessing tools, sentiment analysis
rules, and result-aggregation utilities make up the custom sentiment analysis system for
Ukrainian-language content.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Experiment</title>
      <p>VADER (Valence Aware Dictionary and Sentiment Reasoner) is used as a baseline for evaluating the
effectiveness of the rule-based sentiment analysis algorithm. It is a suitable benchmark
because it is also rule-based and accounts for punctuation, emojis, and intensity modifiers.
Nevertheless, VADER is optimized mainly for English-language content, and applying it directly to
Ukrainian texts is challenging [40].</p>
      <sec id="sec-4-1">
        <title>4.1. Selection of VADER as the Baseline</title>
        <p>The choice of VADER as a baseline for comparison in this study was influenced by several key
factors. Firstly, VADER is widely recognized and utilized in sentiment analysis research and industry
applications, particularly for analyzing short texts in social media contexts. Its popularity stems from
its ability to efficiently process sentiment in informal and digital communication.</p>
        <p>Secondly, VADER offers a transparent and interpretable rule-based approach, similar to the
proposed algorithm. By relying on predefined lexicons and scoring mechanisms, it allows for a direct
comparison of methodologies without the opacity often associated with deep learning models.</p>
        <p>Additionally, VADER incorporates various non-lexical features, such as punctuation handling,
capitalization detection, and intensity modifiers. These features align with the enhancements
introduced in the custom rule-based algorithm, making it a suitable benchmark for evaluating
sentiment classification techniques.</p>
        <p>However, VADER has significant limitations when applied to the Ukrainian language. It lacks
Ukrainian-specific lexicons, dependency parsing, and linguistic resources necessary for accurate
sentiment interpretation. As a result, its application to Ukrainian texts requires modifications or
adaptations to achieve reliable results, highlighting the need for language-specific sentiment analysis
models.</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Testing Methodology</title>
        <p>Both algorithms are tested on the same dataset of Ukrainian-language texts. The dataset
contains pre-labeled samples in three categories: positive, neutral, and negative. Using
identical inputs for both algorithms allows their performance to be compared directly.</p>
        <p>The steps in the methodology are:
1. Preprocessing. The texts were cleaned to remove noise, such as extra spaces and special
characters. In the case of VADER, texts were translated into English using Python's translate
library. This was necessary because VADER works exclusively with English texts. The
custom algorithm processed the original Ukrainian texts without translation.
2. Sentiment classification. VADER and the custom algorithm independently classified each text
into positive, neutral, or negative categories and assigned a compound score representing
sentiment intensity.
3. Evaluation metrics.</p>
        <p>• Accuracy – the proportion of correctly classified texts.
• F1-score – the harmonic mean of precision and recall for each sentiment category.
• Mean compound score – the average compound score for each sentiment category, used to check
how well the scores align with human annotation.
4. Error analysis. Several misclassifications were analyzed to identify patterns and edge
cases where one algorithm performed better than the other.
5. Visualization. The sentiment distributions (positive, neutral, negative) of both algorithms were
compared in bar charts, and their mean compound scores were reported in statistical summaries.</p>
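        <p>The evaluation metrics listed above can be illustrated with a minimal Python sketch. The function names <monospace>accuracy</monospace>, <monospace>f1_per_class</monospace>, and <monospace>mean_compound_by_class</monospace>, as well as the toy labels and scores, are illustrative assumptions and not the authors' implementation or data.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from collections import defaultdict

LABELS = ("positive", "neutral", "negative")


def accuracy(y_true, y_pred):
    """Proportion of texts whose predicted label equals the gold label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_per_class(y_true, y_pred, labels=LABELS):
    """Per-class F1: harmonic mean of precision and recall."""
    scores = {}
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        total = precision + recall
        scores[label] = 2 * precision * recall / total if total else 0.0
    return scores


def mean_compound_by_class(y_true, compounds):
    """Average compound score over the texts of each gold category."""
    grouped = defaultdict(list)
    for label, score in zip(y_true, compounds):
        grouped[label].append(score)
    return {label: sum(v) / len(v) for label, v in grouped.items()}


# Toy data (hypothetical, not taken from the study's dataset).
gold = ["positive", "negative", "neutral", "negative"]
pred = ["positive", "negative", "negative", "negative"]
compound = [0.6, -0.5, -0.1, -0.7]

print(accuracy(gold, pred))
print(f1_per_class(gold, pred))
print(mean_compound_by_class(gold, compound))
```
]]></preformat>
        <p>Reporting F1 per category rather than a single averaged value keeps weak performance on one class, such as neutral texts, visible.</p>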
        <p>A context diagram of the designed system (Fig. 2) was built to explain the process behind the
comparison of the two algorithms.</p>
        <p>In this model, the input is raw Ukrainian text data, which serves as the basis for
sentiment analysis. This data can originate from various sources, including user-generated content,
social media posts, or datasets prepared for research purposes. The output of the system is the
sentiment classification, which reflects whether the analyzed text conveys positive, negative, or
neutral sentiment.</p>
        <p>The system is influenced by several control components:
• Lexicons and rules. These provide the semantic and syntactic frameworks needed for
accurate sentiment detection. Lexicons include emotional word dictionaries, sentiment
polarity scores, and rules that define language-specific sentiment cues.
• Preprocessing methods. This refers to the set of techniques used to prepare raw text for
analysis, including tokenization, stopword removal, and normalization processes.
• Computational environment. This includes the hardware and software infrastructure that
supports the processing and analysis of data, ensuring system performance and scalability.</p>
        <p>The sentiment analysis system operates by integrating these inputs and controls to generate
reliable sentiment classifications based on predefined rules and linguistic resources. To gain a deeper
understanding of the sentiment analysis workflow, the context diagram was decomposed into several
sub-processes (Fig. 3).</p>
        <p>The decomposition diagram outlines the system as a series of interconnected stages, each
performing a specific role within the sentiment analysis pipeline:
• Text preprocessing. The first sub-process is responsible for preparing raw Ukrainian text
data. This step ensures that the input data is clean, structured, and ready for sentiment
analysis.
• Sentiment calculation. After preprocessing, the processed text vector is passed to the
sentiment calculation module. The output of this stage is the result of sentiment analysis, a
numerical or categorical representation of sentiment.
• Result interpretation. The final stage translates the analytical results into human-readable
sentiment classifications. The output, the sentiment classification, is presented to the user or
passed to other systems for further use.</p>
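        <p>The result interpretation stage can be sketched as a threshold rule on the compound score. The cut-offs of ±0.05 follow the convention commonly used with VADER. Whether the custom algorithm applies the same thresholds is an assumption, and the function name <monospace>interpret</monospace> is hypothetical.</p>
        <preformat preformat-type="code"><![CDATA[
```python
def interpret(compound, threshold=0.05):
    """Map a compound score in [-1, 1] to a sentiment class.

    The default threshold mirrors the cut-off commonly used with VADER;
    it is an assumed value, not one reported in this study.
    """
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


for score in (0.61, 0.0, -0.44):
    print(score, interpret(score))
```
]]></preformat>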
        <p>The text preprocessing sub-process is broken down further in Fig. 4.</p>
        <p>Text preprocessing involves three main steps. The first step, tokenization, breaks down the input
text into individual tokens, such as words, punctuation, and symbols, while preserving meaningful
text structures. The output of this process is tokenized input data, which serves as an intermediate
representation for further refinement. The next step is stopword filtering, where common
stopwords, such as "і" ("and"), "та" ("and"), and "або" ("or") in Ukrainian, are removed because they contribute little to
sentiment analysis. This results in a filtered set of tokens that are more relevant for sentiment
classification. Finally, N-gram detection identifies sequences of words, such as bigrams or trigrams,
that may carry contextual sentiment meaning. The output of this step is a processed vector of data,
ready for sentiment analysis.</p>
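        <p>The three preprocessing steps can be sketched with the Python standard library. The regular expression, the three-word stopword set, and the function names are illustrative assumptions. A production system would likely use a fuller stopword list and a tokenizer tuned for Ukrainian.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import re

# Tiny illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {"і", "та", "або"}


def tokenize(text):
    """Split text into lowercase word tokens and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text.lower())


def remove_stopwords(tokens, stopwords=STOPWORDS):
    """Drop tokens that appear in the stopword set."""
    return [t for t in tokens if t not in stopwords]


def ngrams(tokens, n):
    """Return all contiguous n-token sequences as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def preprocess(text):
    """Tokenize, filter stopwords, and collect the tokens plus bigrams."""
    tokens = remove_stopwords(tokenize(text))
    return {"tokens": tokens, "bigrams": ngrams(tokens, 2)}


print(preprocess("Дуже добре і тепло!"))
```
]]></preformat>
        <p>Punctuation is kept as separate tokens here because marks such as "!" can later act as intensity cues.</p>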
        <p>To show how the parts of the system work together, a class diagram was built (Fig.
5). The Sentiment Analysis System consists of key entities that work together to process and analyze
Ukrainian-language text. The TextProcessor handles initial preprocessing tasks such as tokenization,
stopword removal, and n-gram generation. It prepares the raw text data for analysis. The Lexicon
manages sentiment-related resources, including emotion lexicons, phrase sentiment scores, emoji
sentiment mappings, and booster words, which are used to determine the polarity and intensity of
sentiments within the text. The DependencyParser focuses on the syntactic structure of sentences,
identifying elements like negations, modifiers, and punctuation that can influence sentiment. This
information is critical for accurate sentiment adjustments. The SentimentAnalyzer serves as the core
component, combining the outputs from the TextProcessor, Lexicon, and DependencyParser to
calculate sentiment scores. It adjusts these scores based on contextual factors such as negations and
modifiers.</p>
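        <p>How a component like the SentimentAnalyzer might combine lexicon scores with boosters and negations can be sketched as follows. The lexicon entries, valence values, booster weight, and polarity flip are all hypothetical. The normalization x / sqrt(x² + α) with α = 15 is the form used by VADER and is assumed here only for illustration. A fixed look-back of one token stands in for the dependency parsing performed by the DependencyParser.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import math

# Hypothetical miniature resources; real lexicons would be far larger.
LEXICON = {"добре": 1.9, "добрий": 1.5, "погано": -2.1, "поганий": -2.3}
BOOSTERS = {"дуже": 0.3}   # intensifiers ("very") add to the magnitude
NEGATIONS = {"не"}         # negators ("not") flip the polarity


def normalize(score, alpha=15.0):
    """Squash an unbounded sum into (-1, 1), as VADER does."""
    return score / math.sqrt(score * score + alpha)


def compound_score(tokens):
    """Sum lexicon valences, adjusted by the immediately preceding token."""
    total = 0.0
    for i, token in enumerate(tokens):
        valence = LEXICON.get(token)
        if valence is None:
            continue
        previous = tokens[i - 1] if i > 0 else ""
        if previous in BOOSTERS:
            # Boost away from zero, keeping the sign of the valence.
            valence += math.copysign(BOOSTERS[previous], valence)
        elif previous in NEGATIONS:
            valence = -valence
        total += valence
    return normalize(total)


print(round(compound_score(["дуже", "добре"]), 3))
print(round(compound_score(["не", "погано"]), 3))
print(round(compound_score(["поганий", "день"]), 3))
```
]]></preformat>
        <p>Replacing the one-token look-back with relations from a dependency parse would let a negation or modifier affect a sentiment word that is not adjacent to it.</p>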
        <p>For comparative analysis, the system includes a VaderAnalyzer, which applies the VADER
sentiment analysis method to Ukrainian text after it has been translated into English. The
SentimentComparer brings everything together, comparing the results of the custom
SentimentAnalyzer and the VaderAnalyzer. It also handles translation and visualizes the
sentiment distributions in graphs, allowing an easy comparison of the two methods'
performance.</p>
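        <p>The comparison step can be illustrated with a small sketch that tallies the label distribution and the mean compound score produced by each analyzer. The functions <monospace>summarize</monospace> and <monospace>compare</monospace>, the stand-in analyzers, and their fixed outputs are hypothetical. In the described system, the two callables would wrap the custom SentimentAnalyzer and the VaderAnalyzer applied to translated text.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from collections import Counter


def summarize(results):
    """Return label shares (in percent) and the mean compound score.

    `results` is a list of (label, compound) pairs for one analyzer.
    """
    counts = Counter(label for label, _ in results)
    n = len(results)
    shares = {label: 100.0 * c / n for label, c in counts.items()}
    mean = sum(score for _, score in results) / n
    return shares, mean


def compare(texts, analyzers):
    """Run every analyzer on the same texts and summarize each one."""
    return {name: summarize([fn(t) for t in texts])
            for name, fn in analyzers.items()}


# Stand-in analyzers with fixed outputs (hypothetical, for illustration only).
def analyzer_a(text):
    return ("negative", -0.6) if "погано" in text else ("neutral", 0.0)


def analyzer_b(text):
    return ("neutral", 0.0)


report = compare(["це погано", "сьогодні середа"],
                 {"A": analyzer_a, "B": analyzer_b})
print(report)
```
]]></preformat>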
        <p>To show how the two algorithms compare in practice, a software tool with a modern user
interface was built (Fig. 6).</p>
        <p>The comparative methodology highlights the flexibility of the custom algorithm for
Ukrainian-language content and indicates where VADER, with its reliance on English-specific resources, has
limited accuracy. This dual evaluation provides valuable insights into the strengths and weaknesses
of rule-based approaches across different languages.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Results</title>
      <p>This section assesses the performance of the proposed custom rule-based sentiment analysis
algorithm compared to the VADER sentiment analysis tool. Both are evaluated on
Ukrainian-language tweets and on their ability to categorize sentiment as positive, neutral, or negative.</p>
      <sec id="sec-5-1">
        <title>5.1. Analysis of Specific Keywords</title>
        <p>Subsets of the dataset containing words with a clear sentiment were selected, such as the
words "добре" meaning "good", "добрий" meaning "kind", "погано" meaning "badly", and "поганий"
meaning "bad".</p>
        <p>"Добре" (Good)</p>
        <p>For "погано", the custom analyzer also identified negative sentiment more reliably. As
Figure 9 shows, the custom analyzer classified 65% of the tweets as negative, compared with 49% for
VADER. Its mean compound score of -0.44 also reflected the negative sentiment more closely than
VADER's, which averaged -0.22.</p>
        <p>Figure 10 displays the results for "поганий". The custom analyzer classified 78% of the tweets as
negative, while VADER did so for 61%. In addition, the mean compound score of -0.61 for the custom
analyzer was much lower than VADER's -0.31, meaning that the former better matched the intensity
of negative feeling in the dataset.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Discussions</title>
      <p>Table 1 summarizes the experimental results for the two sentiment analysis methods, the
VADER analyzer and the custom analyzer.</p>
      <p>The results in Table 1 and Figure 11 indicate that the custom rule-based analyzer outperforms
VADER on all tested datasets. The custom analyzer is better at classifying both highly positive
and highly negative texts because of the following strengths:
1. Expanded lexicon. EMOLEX, polarity_score.csv, intensity booster words, and a large set of phrase
sentiment scores were included to improve coverage of sentiment-laden words and phrases.
2. Context-aware adjustments. The algorithm uses dependency analysis and position-based
weighting to account for modifiers, negations, and punctuation, improving the
accuracy of sentiment classification.
3. Emoji sentiment recognition. A carefully curated emoji lexicon enables better handling of
modern text features that some other algorithms ignore.</p>
      <p>Moreover, translating the texts for VADER added noise, which may have affected its results
on the Ukrainian-language dataset. This again points to the need for language-specific sentiment analysis tools.</p>
    </sec>
    <sec id="sec-7">
      <title>7. Conclusions</title>
      <p>This study introduced a novel rule-based sentiment analysis algorithm specifically designed for
the Ukrainian language. By integrating a diverse set of linguistic resources, including the EMOLEX
lexicon, polarity scores, emoji sentiment mapping, and intensity boosters, the proposed approach
effectively addresses key challenges in Ukrainian-language sentiment analysis. Additionally, the
incorporation of advanced dependency parsing and position-aware scoring enhances the algorithm’s
ability to process complex linguistic structures.</p>
      <p>The evaluation, conducted using datasets from a previous study, demonstrates that the custom
algorithm surpasses VADER in identifying sentiment polarity, particularly for texts with clear
positive or negative sentiment. VADER nevertheless remains competitive in detecting neutral
content. These findings underscore the necessity
of language-specific sentiment analysis tools for non-English content.</p>
      <p>The comparative analysis further confirms that a domain-specific rule-based algorithm, when
supported by a well-structured lexicon and carefully designed linguistic rules, can achieve
performance levels that match or exceed those of widely used sentiment analysis tools like VADER. The results
highlight the potential of tailored linguistic approaches in improving sentiment analysis for
underrepresented languages, paving the way for further advancements in this domain.</p>
      <p>While the current state of the rule-based algorithm has shown great promise, there is much room
for further improvement in several ways:</p>
      <p>1. Adding AI. Future versions could integrate machine learning or deep learning
models to enhance the rule-based system. Hybrid approaches that combine
rule-based techniques with neural network models would likely further improve accuracy on
ambiguous and nuanced content.
2. Extension to other languages. After the success of the Ukrainian language model, the next
step would be to extend the approach to other underrepresented languages. Multilingual
capabilities would make the algorithm more applicable and increase its impact.
3. Dynamic lexicon updating. The algorithm could be extended to adapt its lexicon automatically to
emerging trends, slang, and domain-specific terminology through web scraping and natural
language processing techniques.
4. Sentiment granularity. Future work could include finer-grained analysis, for instance
distinguishing "somewhat" positive from "highly" positive sentiment.
5. Benchmarking against AI models. Future studies could benchmark the algorithm against
state-of-the-art AI-based sentiment analysis models, such as BERT or GPT, to establish the
advantages and disadvantages of the rule-based approach relative to these models.</p>
      <p>This study thus provides a basis for further exploration of language- and domain-specific
sentiment analysis. The approach is especially promising because integrating AI models
could balance the interpretability typical of rule-based methods with the
adaptability of machine learning techniques.</p>
    </sec>
    <sec id="sec-8">
      <title>Declaration on Generative AI</title>
      <p>The authors have not employed any Generative AI tools.</p>
      <p>[31] A. Graves, N. Jaitly, A. Mohamed, Hybrid Speech Recognition with Deep Bidirectional LSTM,
Proceedings of the 2013 IEEE Workshop on Automatic Speech Recognition and Understanding,
2013, pp. 273-278.
[32] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, I. Polosukhin,
Attention is All You Need, Advances in Neural Information Processing Systems, 30, 2017, pp.
5998-6008.
[33] J. Devlin, M. W. Chang, K. Lee, K. Toutanova, BERT: Pre-training of Deep Bidirectional
Transformers for Language Understanding, Proceedings of the 2019 Conference of the North
American Chapter of the Association for Computational Linguistics: Human Language
Technologies, 2019, pp. 4171-4186.
[34] T. Mikolov, K. Chen, G. Corrado, J. Dean, Efficient Estimation of Word Representations in
Vector Space, arXiv preprint arXiv:1301.3781, 2013.
[35] P. Bojanowski, E. Grave, A. Joulin, T. Mikolov, Enriching Word Vectors with Subword
Information, Transactions of the Association for Computational Linguistics, 5, 2017, pp.
135-146.
[36] J. Pennington, R. Socher, C. D. Manning, GloVe: Global Vectors for Word Representation,
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing
(EMNLP), 2014, pp. 1532-1543.
[37] A. Radford, K. Narasimhan, T. Salimans, I. Sutskever, Improving Language Understanding by
Generative Pre-Training, OpenAI preprint, 2018.
[38] A. Radford, J. Wu, R. Child, D. Luan, D. Amodei, I. Sutskever, Language Models are Unsupervised Multitask Learners,
OpenAI preprint, 2019.
[39] T. Brown, B. Mann, N. Ryder, M. Subbiah, Language Models are Few-Shot Learners, Advances
in Neural Information Processing Systems, 33, 2020, pp. 1877-1901.
[40] Y. Liu, M. Ott, N. Goyal, J. Du, RoBERTa: A Robustly Optimized BERT Pretraining Approach,
arXiv preprint arXiv:1907.11692, 2019.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>S.</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Ranjan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. N.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <article-title>Comprehensive Study on Sentiment Analysis: From Rule-based to Modern LLM-based Systems</article-title>
          ,
          <source>arXiv preprint arXiv:2409.09989</source>
          ,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A.</given-names>
            <surname>Kotelnikova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Paschenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Bochenina</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Kotelnikov</surname>
          </string-name>
          ,
          <article-title>Lexicon-based Methods vs. BERT for Text Sentiment Analysis</article-title>
          ,
          <source>arXiv preprint arXiv:2111.10097</source>
          ,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>D.</given-names>
            <surname>Vilares</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Gómez-Rodríguez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Alonso</surname>
          </string-name>
          ,
          <article-title>Universal, Unsupervised (Rule-Based), Uncovered Sentiment Analysis</article-title>
          ,
          <source>arXiv preprint arXiv:1606.05545</source>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>O.</given-names>
            <surname>Kellert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. U.</given-names>
            <surname>Zaman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. H.</given-names>
            <surname>Matlis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Gómez-Rodríguez</surname>
          </string-name>
          ,
          <article-title>Experimenting with UD Adaptation of an Unsupervised Rule-based Approach for Sentiment Analysis of Mexican Tourist Texts</article-title>
          ,
          <source>arXiv preprint arXiv:2309.05312</source>
          ,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>O.</given-names>
            <surname>Al-Harbi</surname>
          </string-name>
          ,
          <article-title>Negation Handling in Machine Learning-Based Sentiment Classification for Colloquial Arabic</article-title>
          ,
          <source>arXiv preprint arXiv:2107.11597</source>
          ,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>O.</given-names>
            <surname>Mediakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Basyuk</surname>
          </string-name>
          ,
          <article-title>Specifics of Designing and Construction of the System for Deep Neural Networks Generation</article-title>
          ,
          <source>CEUR Workshop Proceedings</source>
          , Vol.
          <volume>3171</volume>
          : Computational Linguistics and Intelligent Systems 2022: Proceedings of the 6th International Conference on Computational Linguistics and Intelligent Systems (COLINS 2022), Vol. 1: Main conference, Gliwice, Poland, May 12-13,
          <year>2022</year>
          , pp.
          <fpage>1282</fpage>
          -
          <lpage>1296</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>L.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <source>Deep Learning for Sentiment Analysis: A Survey</source>
          ,
          <source>Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery</source>
          ,
          <volume>8</volume>
          (
          <issue>4</issue>
          ),
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>M.</given-names>
            <surname>Taboada</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Brooke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Tofiloski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Voll</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Stede</surname>
          </string-name>
          ,
          <article-title>Lexicon-Based Methods for Sentiment Analysis</article-title>
          ,
          <source>Computational Linguistics</source>
          ,
          <volume>37</volume>
          (
          <issue>2</issue>
          ),
          <year>2011</year>
          , pp.
          <fpage>267</fpage>
          -
          <lpage>307</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>B.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <source>Sentiment Analysis and Opinion Mining, Synthesis Lectures on Human Language Technologies</source>
          ,
          <volume>5</volume>
          (
          <issue>1</issue>
          ),
          <year>2012</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>167</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>E.</given-names>
            <surname>Cambria</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hussain</surname>
          </string-name>
          ,
          <source>Sentic Computing: A Common-Sense-Based Framework for Concept-Level Sentiment Analysis</source>
          , Springer,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>E.</given-names>
            <surname>Cambria</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Decherchi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Xing</surname>
          </string-name>
          , K. Kwok,
          <article-title>SenticNet 7: A Commonsense-Based Neurosymbolic AI Framework for Explainable Sentiment Analysis</article-title>
          ,
          <source>Proceedings of the 29th ACM International Conference on Information and Knowledge Management</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>105</fpage>
          -
          <lpage>114</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>T.</given-names>
            <surname>Basyuk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Vasyliuk</surname>
          </string-name>
          ,
          <article-title>Approach to a subject area ontology visualization system creating</article-title>
          ,
          <source>CEUR Workshop Proceedings</source>
          , Vol.
          <volume>2870</volume>
          : Proceedings of the 5th International Conference on Computational Linguistics and Intelligent Systems (COLINS 2021), Volume I: Main conference, Lviv, Ukraine, April 22-23,
          <year>2021</year>
          , pp.
          <fpage>528</fpage>
          -
          <lpage>540</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>A.</given-names>
            <surname>Agarwal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Xie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Vovsha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Rambow</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Passonneau</surname>
          </string-name>
          ,
          <article-title>Sentiment Analysis of Twitter Data</article-title>
          ,
          <source>Proceedings of the Workshop on Languages in Social Media</source>
          ,
          <year>2011</year>
          , pp.
          <fpage>30</fpage>
          -
          <lpage>38</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>C. N.</given-names>
            <surname>dos Santos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Gatti</surname>
          </string-name>
          ,
          <article-title>Deep Convolutional Neural Networks for Sentiment Analysis of Short Texts</article-title>
          ,
          <source>Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers</source>
          ,
          <year>2014</year>
          , pp.
          <fpage>69</fpage>
          -
          <lpage>78</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>M.</given-names>
            <surname>Hu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Liu</surname>
          </string-name>
          , Mining and Summarizing Customer Reviews,
          <source>Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining</source>
          ,
          <year>2004</year>
          , pp.
          <fpage>168</fpage>
          -
          <lpage>177</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>I.</given-names>
            <surname>Titov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>McDonald</surname>
          </string-name>
          ,
          <article-title>A Joint Model of Text and Aspect Ratings for Sentiment Summarization</article-title>
          ,
          <source>Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics</source>
          ,
          <year>2008</year>
          , pp.
          <fpage>308</fpage>
          -
          <lpage>316</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Kim</surname>
          </string-name>
          , E. Hovy,
          <source>Determining the Sentiment of Opinions, Proceedings of the 20th International Conference on Computational Linguistics</source>
          ,
          <year>2004</year>
          , pp.
          <fpage>1367</fpage>
          -
          <lpage>1373</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>P. D.</given-names>
            <surname>Turney</surname>
          </string-name>
          ,
          <article-title>Thumbs Up or Thumbs Down? Semantic Orientation Applied to Unsupervised Classification of Reviews</article-title>
          ,
          <source>Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics</source>
          ,
          <year>2002</year>
          , pp.
          <fpage>417</fpage>
          -
          <lpage>424</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>B.</given-names>
            <surname>Pang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Vaithyanathan</surname>
          </string-name>
          , Thumbs Up?
          <article-title>Sentiment Classification Using Machine Learning Techniques</article-title>
          ,
          <source>Proceedings of the ACL-02 Conference on Empirical Methods in Natural Language Processing</source>
          ,
          <year>2002</year>
          , pp.
          <fpage>79</fpage>
          -
          <lpage>86</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>E.</given-names>
            <surname>Riloff</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Wiebe</surname>
          </string-name>
          ,
          <article-title>Learning Extraction Patterns for Subjective Expressions</article-title>
          ,
          <source>Proceedings of the 2003 Conference on Empirical Methods in Natural Language Processing</source>
          ,
          <year>2003</year>
          , pp.
          <fpage>105</fpage>
          -
          <lpage>112</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>T.</given-names>
            <surname>Basyuk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Vasyliuk</surname>
          </string-name>
          ,
          <article-title>Peculiarities of an Information System Development for Studying Ukrainian Language and Carrying out an Emotional and Content Analysis</article-title>
          ,
          <source>CEUR Workshop Proceedings</source>
          , Vol.
          <volume>3396</volume>
          : Computational Linguistics and Intelligent Systems 2023: Proceedings of the 7th International Conference on Computational Linguistics and Intelligent Systems, Volume II: Computational Linguistics Workshop, Kharkiv, Ukraine, April 20-21,
          <year>2023</year>
          , pp.
          <fpage>279</fpage>
          -
          <lpage>294</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>S.</given-names>
            <surname>Poria</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Cambria</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Bajpai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hussain</surname>
          </string-name>
          ,
          <article-title>A Review of Affective Computing: From Unimodal Analysis to Multimodal Fusion</article-title>
          ,
          <source>Information Fusion</source>
          ,
          <volume>37</volume>
          ,
          <year>2017</year>
          , pp.
          <fpage>98</fpage>
          -
          <lpage>125</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>E.</given-names>
            <surname>Cambria</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Schuller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Xia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Havasi</surname>
          </string-name>
          ,
          <article-title>New Avenues in Opinion Mining and Sentiment Analysis</article-title>
          ,
          <source>IEEE Intelligent Systems</source>
          ,
          <volume>28</volume>
          (
          <issue>2</issue>
          ),
          <year>2013</year>
          , pp.
          <fpage>15</fpage>
          -
          <lpage>21</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Kim</surname>
          </string-name>
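          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Hovy</surname>
          </string-name>
          ,
          <article-title>Identifying and Analyzing Judgment Opinions</article-title>
          ,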
          <source>Proceedings of the Human Language Technology Conference of the NAACL</source>
          ,
          <year>2006</year>
          , pp.
          <fpage>200</fpage>
          -
          <lpage>207</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>L.</given-names>
            <surname>Dey</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. K. M.</given-names>
            <surname>Haque</surname>
          </string-name>
          ,
          <article-title>Opinion Mining from Noisy Text Data</article-title>
          ,
          <source>International Journal on Document Analysis and Recognition</source>
          ,
          <volume>12</volume>
          (
          <issue>3</issue>
          ),
          <year>2009</year>
          , pp.
          <fpage>205</fpage>
          -
          <lpage>226</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>E.</given-names>
            <surname>Cambria</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hussain</surname>
          </string-name>
          ,
          <source>Sentic Computing: Techniques, Tools, and Applications</source>
          ,
          <publisher-name>Springer</publisher-name>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>C. G.</given-names>
            <surname>Akcora</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Bayir</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Demirbas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Ferhatosmanoglu</surname>
          </string-name>
          ,
          <article-title>Identifying Breakpoints in Public Opinion</article-title>
          ,
          <source>Proceedings of the First Workshop on Social Media Analytics</source>
          ,
          <year>2010</year>
          , pp.
          <fpage>62</fpage>
          -
          <lpage>66</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <article-title>Convolutional Neural Networks for Sentence Classification</article-title>
          ,
          <source>Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)</source>
          ,
          <year>2014</year>
          , pp.
          <fpage>1746</fpage>
          -
          <lpage>1751</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>R.</given-names>
            <surname>Socher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Perelygin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Chuang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. D.</given-names>
            <surname>Manning</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. Y.</given-names>
            <surname>Ng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Potts</surname>
          </string-name>
          ,
          <article-title>Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank</article-title>
          ,
          <source>Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing</source>
          ,
          <year>2013</year>
          , pp.
          <fpage>1631</fpage>
          -
          <lpage>1642</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>S.</given-names>
            <surname>Hochreiter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Schmidhuber</surname>
          </string-name>
          ,
          <article-title>Long Short-Term Memory</article-title>
          ,
          <source>Neural Computation</source>
          ,
          <volume>9</volume>
          (
          <issue>8</issue>
          ),
          <year>1997</year>
          , pp.
          <fpage>1735</fpage>
          -
          <lpage>1780</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>