White-Box Adversarial Attacks Against Sentiment-Analysis Models using an Aspect-Based Approach

Monserrat Vázquez-Hernández1, Ignacio Algredo-Badillo1,*, Luis Alberto Morales-Rosales2,* and Luis Villaseñor-Pineda1

1 Departamento de Ciencias Computacionales, Instituto Nacional de Astrofísica, Óptica y Electrónica (INAOE), Tonantzintla 72840, Puebla, México
2 Facultad de Ingeniería Civil, CONAHCYT-Universidad Michoacana de San Nicolás de Hidalgo, Morelia 58000, Michoacán, México

CISETC 2023: International Congress on Education and Technology in Sciences 2023, December 04–06, 2023, Zacatecas, Mexico
mvazquez@inaoe.mx (M. Vázquez-Hernández); algredobadillo@inaoep.mx (I. Algredo-Badillo); lamorales@conacyt.mx (L. A. Morales-Rosales); villasen@inaoep.mx (L. Villaseñor-Pineda)

Abstract
Adversarial examples are deep learning model inputs whose elements have been modified by small, often imperceptible perturbations that confuse the models during input processing and cause incorrect results. Among the issues still faced in adversarial example design for text applications is the absence of methods that model adversarial example design around the characteristics of the task addressed by the DL model; as a consequence, syntactically incorrect inputs are generated, which impacts the imperceptibility of the modifications and the effective transfer of adversarial examples between models. In this work, we define a model for adversarial example generation, particularly oriented to aspect-based sentiment analysis under a white-box attack; this model considers aspect-based characteristics to drive the course of the modifications. We evaluate our proposed model against adversarial examples generated for document-level analysis, demonstrating its effectiveness in impacting the target model's results, causing accuracy to drop by 20.90% while maintaining a semantic similarity of 99% between the adversarial examples and the original inputs.

Keywords
adversarial attacks, vulnerabilities, aspect-based, sentiment analysis

1. Introduction
Sentiment analysis (SA) concerns the use of text analysis and machine-learning techniques for the automatic extraction and processing of users' opinions [1]. Sentiment-analysis systems are an important tool that provides summarized information concerning the experiences, positive or negative, that actual users have had with a product, service, or topic of interest. Nowadays, sentiment analysis is used in many different areas to better interpret users' opinions. With this, organizations can propose improvements in products or services to enhance the experience of their users. For example, in the education field, the analysis of students' opinions collected from interactive learning environments, assisted collaborative learning, institutional digital media, or school administrative systems enables institutions to identify the sentiments expressed by students through their opinions and thus propose improvements to the student experience [2]. Through students' comments, it is possible to track their learning behavior, progression, and experience and thus enhance the learning process according to students' needs. Understanding students' needs allows for transforming the existing educational infrastructure to benefit students most.
According to information needs, sentiment analysis can be performed at different granularity levels: i) document, ii) sentence, or iii) aspect (the term "aspect" refers to the components, characteristics, or attributes of a product, service, or entity). Document-level analysis refers to the positive or negative classification of a full text [3], while sentence-level analysis aims to classify each sentence in a text [4]. Finally, aspect-level analysis (or aspect-based sentiment analysis, ABSA) seeks to independently determine the opinion expressed about each aspect mentioned within an opinion [5]. In many cases, document- or sentence-level analysis does not provide specific details about particular aspects. For example, a negative document about an entity does not mean that the user has negative opinions about all aspects of that entity [6]; given this situation, it is necessary to work at a lower granularity, hence the interest in and importance of aspect-level analysis. Table 1 illustrates the differences between sentiment analysis at the sentence and aspect levels.

Table 1
Aspect-based sentiment analysis. The evaluated aspects are indicated in bold.

Aspect-level analysis, being a more detailed task, requires methods that accurately identify the opinion terms related to each evaluated aspect in order to provide accurate information about current users' attitudes. In recent years, the use of Deep Learning (DL) models to address aspect-level analysis has gained great popularity; through DL models, the aim is to improve on previous results and increase users' confidence, although this does not always turn out to be true. Several research works [7, 8, 9, 10, 11, 12] have demonstrated that DL models for image or text applications can be effectively fooled by strategically modified inputs known as adversarial examples. Adversarial examples are modified inputs generated to negatively impact a model's results. They are created by adding small, subtle modifications to the original inputs to confuse the model's understanding of those inputs and thus cause their incorrect classification (according to the classification task). Szegedy et al. [12] introduced adversarial examples when they studied the stability of state-of-the-art Deep Neural Networks (DNNs) for image classification in the face of modified inputs. Their work performed small pixel-level modifications to input data and observed that DNNs could be fooled by these modified inputs even though the human perception of the data is not affected [13]. Based on the adversarial-example idea, Jia and Liang [11] considered adversarial-example design to evaluate DNN models on a text-based task. In their work, they experimented with inserting text fragments at the end of inputs without changing the original text, and they observed that text-based DNN models could also be fooled by adversarial examples. Since then, different works on text-based tasks, such as [7, 8, 9, 10], have demonstrated that models can be fooled by making changes at the character, term, or sentence level by adding, deleting, substituting, or swapping text parts. Table 2 illustrates text inputs modified by substituting a term with its synonym.
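To make the idea concrete, the following minimal Python sketch (our illustration, not one of the attacks cited above) substitutes a term with a synonym and keeps a candidate only if a toy classifier changes its prediction; the SYNONYMS table and the predict function are placeholders standing in for a real synonym source and a real victim model.

```python
# Minimal illustration (not an attack from the literature): substitute one term
# with a synonym and keep the candidate only if a (toy) target model changes
# its prediction. SYNONYMS and predict() are placeholders.
SYNONYMS = {
    "terrible": ["awful", "dreadful"],
    "great": ["fantastic", "superb"],
}

def predict(text: str) -> str:
    """Placeholder for the victim sentiment classifier."""
    return "negative" if "terrible" in text or "awful" in text else "positive"

def synonym_substitution_candidates(text: str):
    """Yield texts in which exactly one term is replaced by a synonym."""
    tokens = text.split()
    for i, tok in enumerate(tokens):
        for syn in SYNONYMS.get(tok.lower(), []):
            yield " ".join(tokens[:i] + [syn] + tokens[i + 1:])

original = "the service was terrible"
y = predict(original)
adversarial = [x for x in synonym_substitution_candidates(original) if predict(x) != y]
print(adversarial)  # candidates that flip the toy model's prediction, if any
```

Real attacks differ mainly in how they choose which term to modify and how they verify that the perturbation remains imperceptible to a human reader.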
Previous works oriented to the sentiment analysis task [13, 6, 8] have proven effective at fooling models via adversarial examples. Although these works have impacted accuracy results, they have mainly focused on addressing the task at the document level and have not correctly dealt with aspect-based characteristics. Recently, Ekbal et al. [15] proposed a method to generate adversarial examples oriented to aspect-based sentiment analysis, integrating specific aspect-level characteristics to preserve opinion semantics. Their contribution relies on not modifying the terms of the evaluated aspect, so the modifications to generate adversarial examples are made on the rest of the terms present in the opinion. Although the proposed attack contributes to achieving higher semantic similarity and grammatical correctness, it assumes that only one aspect is evaluated within an opinion, which is not necessarily true (refer to Table 1).

To generate effective adversarial examples, we consider that the modifications must be oriented to the aspect level in a particular way, based on its characteristics. As Table 1 shows, each aspect within the opinion is directly related to some opinion terms. In aspect-level analysis, this aspect-term relation must be identified to correctly determine the sentiment expressed by the user. Therefore, to perform input modifications and generate adversarial examples, the aspect-term relation has to be identified, and changes have to be made that break the relation or change the aspect's sentiment. If modifications are performed without considering the aspect-term relation, irrelevant terms can be modified without achieving the desired effect of adversarial examples, impacting the imperceptibility of the modifications, the semantics and syntax of the texts, and the message's readability.

Table 2
Adversarial examples with a synonym term change. x represents the original input and x' its adversarial example. The modified terms are indicated in bold.

This work proposes a model for generating adversarial examples oriented to aspect-level analysis in order to deal correctly with aspect-level characteristics. Based on our proposed model, we define an adversarial attack to generate adversarial examples that are particularly oriented to aspect-based sentiment analysis models, identifying the terms directly related to the evaluated aspects and modifying them accordingly. We evaluate our adversarial attack against strategies previously applied to sentiment analysis, which have proven effective in misleading DL models. Through the achieved results, we show our attack's effectiveness, since it outperformed the impact of the document-level adversarial attack by a 12.81% difference while maintaining a 99% semantic similarity between the original input and the adversarial example created from it. Moreover, our proposed attack shows generality and transferability across contexts when evaluated on different datasets, achieving context independence and maintaining its negative impact on model results. Summarizing, the main contributions of this work are:
• A model of aspect-based adversarial examples in which the modifications to be performed are driven by aspect-level properties.
• A new strategy for deploying an adversarial attack, especially suited to aspect-level sentiment analysis, that correctly deals with the aspect-term relation.
• An adversarial attack that achieves higher semantic similarity and input readability with fewer modifications.
• An adversarial attack that offers generality and transferability across datasets from different contexts, maintaining the negative impact on the aspect-level model's results as well as the semantic similarity and input readability.

2. Preliminaries
Before introducing the literature review and our proposed attack, this section provides preliminary knowledge related to adversarial attacks on deep learning models, particularly for text-based tasks, covering current techniques and strategies to perform text modifications.

2.1 Adversarial examples
An adversarial attack consists of generating adversarial examples to fool the target model and negatively impact its performance [13]. An adversarial example x' is an input generated by adding a perturbation n to an original input x of the target model, i.e., x' = x + n. A robust model should continue to assign the correct class y to x', while a victim model will have a high probability of misclassifying x'. Zhang et al. [13] present a definition of adversarial examples and propose the following formalization:

f(x) = y,  x ∈ X,
x' = x + n,  f(x') ≠ y  or  f(x') = y', y' ≠ y,    (1)

where n is the worst-case perturbation. The goal of the adversarial attack can be to deviate the label to an incorrect one (f(x') ≠ y) or to a specified one (f(x') = y').

2.2 Threat model
In [16], the crucial aspects to be considered when designing an adversarial attack are discussed. These aspects are described as follows:
• Model Knowledge. Adversarial examples can be designed under a black-box, white-box, or grey-box scenario. Black-box attacks are performed when the details of the target model are unknown; generally, adversarial examples are generated by accessing test data or by querying the target model and verifying an output change. In contrast, a white-box attack relies on knowledge of the technical details. Lastly, the grey-box strategy is a halfway point between the black-box and white-box scenarios.
• Target. Adversarial examples can be generated to change the output prediction either to i) obtain a specific class result (targeted) or ii) cause errors without any particular class (untargeted).
• Granularity. Refers to the level of detail at which modifications are performed. In text applications, adversarial examples can be generated at the character, term, or sentence level, and the techniques to modify input data can be summarized as replace, delete, add, or swap.
• Motivation. Adversarial-example design is motivated by two objectives: attack or defense. Attack aims to examine the robustness of the target model, while defense uses the knowledge of adversarial examples to strengthen it.
To identify the best criteria for designing an adversarial attack, it is necessary to develop, test, and analyze different modifications at different levels to determine which will effectively fool the target model. For this reason, we experimented with designing different adversarial attacks according to model knowledge.

2.3 Strategies
To generate adversarial examples, we can apply different strategies to change specific terms of the input, or the complete input, according to the granularity level. Text-based strategies to modify inputs include the following:
• Concatenation. This strategy consists of adding a sentence, called a distractor text, at the end of a text to confuse the model without changing the semantics of the text [11].
• Edit.
These attacks perform modifications to input data in two ways: i) Synthetic, where the characters in a word or term are reordered via swapping, middle random (random characters are exchanged except the first and the last one), or fully random (all the characters are randomly rearranged); ii) Natural, in which spelling errors already present in the original data are exploited. More advanced applications carry out modifications such as: Random Swap, exchanging neighboring terms; Stopword Dropout, randomly removing stopwords; Paraphrasing, substituting terms with their paraphrase; Grammar Errors, in which, for example, the conjugation of a verb is changed; Add Negation; and the Antonym strategy.
• Paraphrase-based. Carefully produces a paraphrase of the original input with correct syntax and grammar.
• Substitution. This strategy attempts to reproduce the target model's operation in a local model to limit the requests to the victim model [17]. Potential adversarial examples that could confuse the target model are created and evaluated on the local model. If a potential adversarial example manages to confuse the local model, it is considered an adversarial example.

2.4 Modifications control
During adversarial example generation, that is, when inputs are modified, it is necessary to measure and control the modifications to keep them to a minimum size. Moreover, after input modification, the size of the modifications must be measured to ensure their imperceptibility and preserve the semantics of the text. Usually, adversarial examples are measured by the distance between the original (clean) data x and the adversarial example x'.
• Grammar and syntax measurement. Ensuring correct grammar and syntax is necessary to make adversarial examples undetectable. Strategies such as the perplexity measure, paraphrase control, and grammar and syntax checkers have been proposed to measure grammar and syntax.
• Semantic-preserving measurement. The semantic similarity/distance measurement is performed on word vectors using distance measures (such as the Euclidean distance) and similarity measures (such as the cosine similarity).
• Edit-based measurement. Measuring the number of edits (modifications) quantifies the minimum changes needed to transform one text into another. Different definitions of edit distance use different operations, for example: the Jaccard similarity coefficient, Word Mover's Distance (WMD), or the perturbation ratio.

2.5 Evaluation metrics
Evaluating the effectiveness of adversarial attacks against sentiment-analysis systems requires a set of metrics. Therefore, we use the following metrics (a minimal sketch of how they can be computed is given after this list):
• Success rate. The success rate is the most direct and effective evaluation criterion [18]. The attack success rate indicates the percentage of adversarial examples that successfully attack the target model and, conversely, the percentage of inputs on which the attack fails. This measure provides insight into the susceptibility of a model to the designed adversarial examples.
• Model robustness. Adversarial attacks are designed to affect the performance of models with respect to correct classifications. The robustness of DL models is related to the classification accuracy before the attack, Before-Attack Accuracy (BA), and how it is affected by adversarial examples, After-Attack Accuracy (AA).
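As a minimal illustration of these metrics, the sketch below (our code, assuming that gold labels and model predictions are available as plain lists) computes the accuracy before and after an attack and one common formulation of the attack success rate: the fraction of originally correctly classified inputs whose adversarial version is misclassified.

```python
from typing import List

def accuracy(preds: List[str], gold: List[str]) -> float:
    """Fraction of correctly classified inputs (used for both BA and AA)."""
    return sum(p == g for p, g in zip(preds, gold)) / len(gold)

def attack_success_rate(preds_clean: List[str],
                        preds_adv: List[str],
                        gold: List[str]) -> float:
    """Among inputs classified correctly before the attack, the fraction
    whose prediction becomes wrong on the adversarial version."""
    attacked = [(c, a, g) for c, a, g in zip(preds_clean, preds_adv, gold) if c == g]
    if not attacked:
        return 0.0
    flipped = sum(a != g for _, a, g in attacked)
    return flipped / len(attacked)

# Toy usage: BA is computed on clean inputs, AA on their adversarial versions.
gold        = ["pos", "neg", "pos", "neg"]
preds_clean = ["pos", "neg", "pos", "pos"]   # BA = 0.75
preds_adv   = ["neg", "neg", "pos", "pos"]   # AA = 0.50
print(accuracy(preds_clean, gold), accuracy(preds_adv, gold),
      attack_success_rate(preds_clean, preds_adv, gold))
```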
3. Related work
Table 3 summarizes the adversarial attacks for sentiment analysis models reviewed. According to the threat-model characteristics proposed in [16], we indicate for each work the model knowledge, granularity, target (objective), and strategy applied. Additionally, we include the attacked DNN model, the metric considered to evaluate the effectiveness of the attack, and the modification control applied. The following subsection describes these works in detail.

Table 3
Adversarial attacks for sentiment analysis models

Work | Model Knowledge | Granularity | Target | Strategy | DNN Model | Evaluation Metric | Modification Control
[14] | White-box | Character, Sentence | Targeted | Edit | CNN | Model robustness | -
[7] | White-box | Character, Term | Untargeted | Hybrid | CNN | Model robustness | Word Mover's Distance
[9] | White-box | Character, Term | Untargeted | Edit | LSTM, CNN | Success rate | Edit distance, Jaccard similarity, Euclidean distance, Semantic similarity
[8] | White-box | Term | Untargeted | Edit | CNN | Success rate | Perplexity, Semantic similarity
[10] | Black-box | Term | Untargeted | Edit | LSTM | Success rate | -
[19] | Black-box | Term | Untargeted | Paraphrase | BiDAF, Visual7W, fastText | Success rate | Semantic similarity
[20] | Black-box | Character, Term | Untargeted | Hybrid | CNN, LSTM | Model robustness | -
[21] | Black-box | Term, Sentence | Untargeted | Hybrid | CNN, LSTM, BERT | Model robustness | Semantic similarity
[22] | Grey-box | Term | Untargeted | Edit | LSTM | Model robustness | Semantic similarity, Fluency

3.1 Adversarial attacks for sentiment analysis models
The principal objective of sentiment analysis models is to obtain an effective set of terms that uniquely identify the different sentiments (positive, negative, or neutral) and contribute to classifying an opinion. Some authors refer to these terms as valuable words, since they play a crucial role in the final classification [23, 24]. Recent research seeks to determine with high precision the terms that contribute to the correct input classification and to use them to create adversarial examples [25].

Liang et al. [14] presented a white-box adversarial attack denominated TextFool. TextFool is a targeted attack that uses the concept of FGSM (Fast Gradient Sign Method) to approximate the contribution of the terms in a text and identify those with a high impact on the input classification. In the TextFool method, adversarial examples are created by implementing three types of modification at the sentence level: insert, modify (in which some characters are replaced), and delete. Gao et al. [20] proposed the DeepWordBug method to generate small text perturbations in a black-box scenario. In this method, the Replace-1 Score (R1S), Temporal Head Score (THS), Temporal Tail Score (TTS), and Combined Score (CS) scoring strategies are proposed to identify key terms that, if modified, cause the classifier to make an incorrect prediction. Character-level transformations are performed on the most relevant terms to minimize the edit distance of the perturbation from the original input. The main difficulties in generating adversarial texts include: i) the input space is discrete, making it challenging to accumulate small noises in text inputs, and ii) measuring the quality of adversarial texts so that the modifications remain imperceptible. Gong et al. [7] proposed a white-box scenario in which the discrete space is addressed by generating adversarial texts in the embedding space against a CNN model. Furthermore, the Word Mover's Distance (WMD) is implemented to evaluate the similarity of the generated adversarial texts to the original inputs.
Li et al. [9] presented a method called TextBugger, which generates adversarial texts in a white-box environment and constrains the perturbations by evaluating their quality with different similarity measures: edit distance, Jaccard similarity coefficient, Euclidean distance, and cosine similarity. Tsai et al. [8] proposed a white-box method called Global Search; they made simple modifications by adding spelling-error noise to preserve the quality of the modifications, under the idea that humans consider this type of error normal; additionally, a more sophisticated approach called Greedy Search is proposed, in which the k nearest neighbors of each word in an opinion are chosen as replacements in order to control the modifications. Perplexity is implemented to measure the degree of distortion (modification) of the generated adversarial examples.

On the other hand, one challenge to be faced when generating adversarial texts is preserving correct semantics and syntax to maintain the original input's legibility. To deal with this, Alzantot et al. [10] used a population-based optimization algorithm to generate semantically and syntactically similar adversarial examples in an attempt to fool sentiment analysis and textual entailment models. In the first stage, the main valuable words are identified and, for each word, its N nearest synonym neighbors that could replace it are searched in the dataset. Then, to select the correct synonym to replace a word, the Google 1-billion-words language model is used to discard those that are less frequent in the context of the text. Finally, from the remaining terms, the one that contributes most to the sentiment classification when substituting the original term is selected. Jin et al. [21] proposed the TextFooler method. This method uses two fundamental Natural Language Processing tasks to generate adversarial examples: i) text classification and ii) textual entailment. According to the authors, using these tasks allows the preservation of the semantic and grammatical content, as well as the correct human classification. Xu et al. [22] presented a grey-box adversarial attack and defense framework for sentiment classification, which addresses the issues of differentiability, label preservation, and input reconstruction for adversarial attack and defense in a unified framework.

Up to now, most current works address different tasks by applying global strategies to modify inputs. This does not necessarily provide a correct solution, since the specific challenges of each task must be handled for a correct modification process. Although previous adversarial attacks focusing on sentiment analysis have fooled models and reduced the accuracy of their results, these works have focused on addressing sentiment analysis at the document level and have not modeled the problem to deal with aspect-level characteristics. A recent attempt to attack an aspect-level classifier was made by Ekbal et al. [15]; their method integrates aspect-level characteristics to generate adversarial examples. In this method, given an evaluated aspect within an opinion, the terms that are part of the aspect are not modified, while all other terms in the opinion are candidates to be replaced by a synonym. The terms to be modified are selected according to their influence on the classification; to determine the influence of a term, it is masked with a special token and the model to be attacked is queried to check whether the term influences the classification.
The proposed attack contributes to achieving higher semantic and syntactic correctness; however, it assumes that only one aspect is evaluated, which is not necessarily true. By nature, different aspects can be included within an opinion, and different attitudes can be expressed for each of them. So, to determine a user's opinion towards the aspects, it is necessary to identify the aspect-term relation, that is, to identify the opinion terms related to each aspect, which determine the positive or negative opinion per aspect (refer to Table 1); after this identification, the modifications to generate adversarial examples should be made on these terms. We consider that an ideal adversarial-example design for aspect-level sentiment analysis should combine aspect-level and adversarial-example characteristics to perform modifications on inputs and thus achieve task-oriented adversarial examples. First, to accomplish task-oriented adversarial examples, it is necessary to correctly determine the set of terms that uniquely identify the different sentiments (positive, negative, or neutral) and contribute to classifying an opinion. Second, for each term, it is necessary to establish the possible modifications it could undergo, taking care that each one preserves the correct opinion semantics and syntax while successfully fooling the model.

4. Aspect-based adversarial examples
In contrast to the reviewed works, we aim to approach aspect-level sentiment analysis (or aspect-based; hereafter, we use aspect-based to refer to the analysis at the aspect level). Given the nature of aspect-based sentiment analysis, to generate adversarial examples the terms to modify need to be selected according to the aspects evaluated within an opinion. For example, according to Table 1, each mentioned aspect is related to specific terms through which it is possible to determine the user's expressed sentiment. Based on this main feature, the formalization of adversarial examples (refer to Eq. 1) must be modified to consider the aspect-term relation and thus generate aspect-based adversarial examples. We define the aspect-based adversarial example model as follows (a minimal sketch of these definitions in code is given after the list):
• Given an opinion x consisting of a set of terms T (where each t ∈ T can be a uni-gram or an n-gram) and a set A of different aspects mentioned within x, for each aspect a_i ∈ A there is a term t_{a_i} ∈ T particularly related to a_i which allows the model M to understand and classify the expressed user sentiment y_{a_i}:

M(x, a_i, t_{a_i}) = y_{a_i}    (2)

• To identify the term t ∈ T particularly related to a_i and establish the aspect-term relation t_{a_i}, the semantic proximity between the aspect a_i and the terms within x has to be computed; this proximity can be expressed as:

SP(a_i, t_i) ∈ [0, 1]    (3)

SP(a_i, t_i) ≈ 0 means that t_i is not related to a_i, while SP(a_i, t_i) ≈ 1 indicates a relation between t_i and a_i.
• The goal of aspect-based adversarial examples is to generate an adversarial example x' via the modification of t_{a_i} into t'_{a_i}, causing M to misclassify y_{a_i}:

M(x', a_i, t'_{a_i}) ≠ y_{a_i}    (4)

At the same time, x' should satisfy the following properties:
• To generate t'_{a_i}, each possible modification to t_{a_i} should maintain the semantic proximity to the original term, i.e., SP(t_{a_i}, t'_{a_i}) ≈ 1.
• The modified input x' should be semantically similar to x. For this, the semantic similarity between x and x' is calculated and controlled.
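The sketch below is a minimal illustration of these definitions; emb (an embedding lookup) and M (the aspect-level classifier) are assumptions standing in for components defined elsewhere, and the aspect is represented by the mean of its term embeddings, anticipating Eq. (5) in Section 4.3.

```python
import numpy as np

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    # Cosine similarity; values close to 1 indicate a strong aspect-term relation.
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

def semantic_proximity(emb, aspect_terms, term) -> float:
    # SP(a_i, t_i), Eq. (3): similarity between the aspect representation and a term.
    aspect_vec = np.mean([emb(t) for t in aspect_terms], axis=0)
    return cosine(aspect_vec, emb(term))

def is_successful(M, x_adv, aspect, t_adv, y_true) -> bool:
    # Eq. (4): the modified opinion must change the aspect-level prediction.
    return M(x_adv, aspect, t_adv) != y_true
```

In practice, emb would be a pre-trained word-embedding lookup and M the target aspect-based classifier; the property SP(t_{a_i}, t'_{a_i}) ≈ 1 can be enforced by measuring the same cosine similarity between the original and the modified term.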
The hypothesis behind our proposal is that, by focusing on the aspect-term relation, the modifications made to generate adversarial examples that negatively impact the model's results will be performed on the minimum necessary terms that effectively support the aspect sentiment. This contributes to making fewer modifications, maintaining the imperceptibility of the modifications and the semantics of the inputs, and allowing the transfer of the attack among aspect-level models.

4.1 Adversarial attack
To evaluate the effectiveness of our proposal, we designed an adversarial attack in a white-box scenario to observe its performance. Figure 1 illustrates our aspect-based adversarial attack design (denominated ABAA). The following sections describe the adversarial attack designed in this work and the achieved results.

Figure 1: ABAA: Aspect-Based Adversarial Attack overview

4.2 Adversary's knowledge
Our adversarial attack takes as target model our previous approach, Sentiment Analysis using Specialized Aspect-Oriented Lexicons [26], which proposes a term-weighting scheme for aspect-based sentiment analysis. This approach takes as input a set of sentiment-oriented lexicons (one per sentiment, i.e., positive, neutral, negative) and models each sentiment in a single vector as the average of the vectors of its terms; each term within an opinion is then weighted according to its semantic closeness to these single lexicon vectors. With this, the terms pointing to a sentiment in an opinion are highlighted, allowing the sentiment classification. To evaluate the weighting scheme, the target model implements a CNN architecture using the SemEval (https://semeval.github.io/) restaurant dataset. The restaurant dataset consists of two subsets: 1) a training set with 2,507 reviews and 2) a test set with 889 reviews. In both sets, the customer reviews include annotations identifying the aspects mentioned and the expressed sentiment polarity.

4.3 Aspect-based adversarial example design
Taking the sentiment-oriented lexicons and the training set, adversarial examples are generated by modifying reviews in the training data prior to training the target model, without affecting the test set (refer to Fig. 1). To create aspect-based adversarial examples, the terms' modification was performed as follows. Given an opinion x and one of the aspects a_i evaluated within it:
1. Define the set of aspect terms a_terms, which includes the terms of the aspect a_i, and the set of opinion terms x_terms, which includes the rest of the terms in the opinion, excluding those in a_terms. Consider the example "food is tasteless but the support staff was friendly". In this example, an evaluated aspect is support staff, thus a_terms = (support, staff) and x_terms = (food, is, tasteless, but, the, was, friendly).
2. Define a unique vector to represent the evaluated aspect. This aspect vector, denoted emb(a_i), is defined as the average of the embeddings of the terms in a_terms (for representing terms and measuring semantic closeness, we use the pre-trained GloVe embeddings, Twitter, 200 dimensions):

emb(a_i) = (1 / |a_terms|) Σ_{t ∈ a_terms} emb(t)    (5)

3. To modify only the terms associated with the aspect under evaluation, we identify the terms in x_terms whose semantic proximity is equal to or above a threshold.
The semantic proximity SP is calculated as the cosine similarity between the term embedding emb(t_i) and the aspect vector emb(a_i):

SP(a_i, t_i) = cos(emb(a_i), emb(t_i))    (6)

and only the terms with SP(a_i, t_i) ≥ β are selected for modification. The filtered terms are then modified by applying a replace or delete technique as follows:
• Replace. The filtered terms in the opinion are replaced. For this, a list of synonyms is obtained for each term, and their semantic closeness is measured. Semantic closeness is defined as the cosine similarity between the original term and the synonym. The synonym that replaces the term can be selected i) as the semantically closest one or ii) by random selection.
• Delete. The filtered terms are deleted from the opinion.
The modifications were tested one by one; subsequently, a hybrid scenario was tested, in which the modification to implement is randomly selected. A sketch of this procedure is given below.
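The following sketch summarizes the procedure above as Python pseudocode under stated assumptions: emb stands for the GloVe lookup mentioned earlier, and synonyms for a per-term list of candidate synonyms (assumed to be sorted by semantic closeness); both are placeholders rather than part of a released implementation.

```python
import numpy as np

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

def aspect_vector(emb, aspect_terms):
    # Eq. (5): average of the embeddings of the aspect's terms.
    return np.mean([emb(t) for t in aspect_terms], axis=0)

def related_terms(emb, aspect_terms, opinion_terms, beta):
    # Eq. (6): keep the opinion terms whose similarity to the aspect vector is >= beta.
    a_vec = aspect_vector(emb, aspect_terms)
    return {t for t in opinion_terms if cosine(a_vec, emb(t)) >= beta}

def abaa_modify(tokens, aspect_terms, related, synonyms, technique="replace"):
    # Apply the delete or replace technique only to the aspect-related terms.
    out = []
    for tok in tokens:
        if tok in aspect_terms or tok not in related:
            out.append(tok)              # aspect terms and unrelated terms stay intact
        elif technique == "delete":
            continue                     # delete technique: drop the related term
        else:
            out.append(synonyms.get(tok, [tok])[0])  # replace with the top-ranked synonym
    return " ".join(out)

# Example from the text: opinion "food is tasteless but the support staff was friendly",
# evaluated aspect a_terms = {"support", "staff"}; x_terms are the remaining tokens.
```

Choosing the top-ranked synonym corresponds to the "most semantically close" variant, while drawing a random element from the list corresponds to the random-replace variant described above.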
5. Experiments and results
The target model implements a CNN architecture using the SemEval restaurant dataset, which includes a training set and a test set of customer reviews with annotations identifying the aspects mentioned and the sentiment polarity of each aspect. We take advantage of the target model's technical details to drive the modifications that generate adversarial examples. Specifically, we implement the edit strategy to modify the terms in the sentiment-oriented lexicons, since these are the most important terms for the target model, allowing it to determine the sentiment polarity of each aspect in an opinion. To filter the terms to be modified, we consider β = {0.2, 0.3, 0.4, 0.5, 0.6}. We defined β empirically, considering that terms with semantic proximity close to 1 have the same direction as the aspect vector and, therefore, are strongly associated with the evaluated aspect and determine the user's expressed sentiment.

5.1 Reference adversarial attack
Prior to designing aspect-oriented adversarial examples, we designed a reference adversarial attack using the sentiment-oriented lexicons and the training dataset. The reference attack consists of modifying opinion terms whenever they appear in the sentiment-oriented lexicons, without validating whether they are related to the aspect being evaluated, thus simulating an attack at the document level. The results obtained from this document-level reference attack serve to observe the potential and effectiveness of our proposal, which will allow us to determine the feasibility of conducting experiments in different scenarios (datasets, target models, modification techniques, etc.) in order to compare it against current work that addresses the design of adversarial examples for aspect-level analysis. In this reference attack, modifications were performed as follows:
• Replace. Taking advantage of the sentiment lexicons used by the target model, all the terms in an opinion that are included in the sentiment lexicons are modified. A list of synonyms is obtained for each term, and their semantic closeness is measured. The synonym that replaces a lexicon term is selected i) as the semantically closest one or ii) by random selection.
• Delete. The sentiment-oriented lexicon terms contained in the opinions are deleted.
As in the aspect-based adversarial attack, the modifications were tested one by one, and subsequently a hybrid scenario was evaluated.

5.2 Evaluation metrics
To measure the effectiveness of our proposed attack, we calculate: i) Before-Attack Accuracy (BA) and After-Attack Accuracy (AA), where BA is calculated before any modification to the training dataset is made, and AA is calculated after the opinions in the training set are modified; ii) Attack Success Rate (SR), the percentage of adversarial examples that successfully attack the target model; and iii) Semantic Similarity (SS), computed between the adversarial and the original sentence using the cosine similarity metric.

5.3 Results and analysis
Firstly, Table 4 presents the results of the target model without any input modification and the results achieved when the reference attack is applied. The results were calculated by executing the target model ten times; the mean and standard deviation (± std) are shown. To evaluate the imperceptibility of the generated adversarial examples, the semantic similarity was measured via the cosine similarity between the original input x and the modified input x'.

Table 4
Reference adversarial attack results by applied technique. BA: Before-Attack Accuracy, AA: After-Attack Accuracy, SS: Semantic similarity. The best results obtained are marked in bold.

Technique | BA | AA | SS
Target model | 82.60 ± 0.46 | - | -
Replace | - | 74.48 ± 0.67 | 0.84
Random replace | - | 79.28 ± 0.37 | 0.81
Delete* | - | 74.50 ± 0.60 | 0.65
Hybrid | - | 78.86 ± 0.47 | 0.73

According to the results in Table 4, we take as reference those achieved by the delete technique, since it has the greatest impact on the target model's accuracy, making it drop from 82.60% to 74.50%. In terms of attack success rate, the target model resisted 607 modified instances, leading to a success rate of 9.806% (66/673) and an after-attack accuracy of 74.47% (607/815). Although deleting a term means losing semantics, syntax, and readability in the original inputs, the reference attack reaches a semantic similarity of only 0.65 and does not mislead the target model any further.

Moving on to the ABAA attack, Table 5 presents the after-attack accuracy (AA) achieved by our proposed strategy under the different modification techniques implemented on the training dataset. The results are organized as follows: for each technique, we evaluate the selection of the terms to be modified according to their semantic proximity to the evaluated aspect; as previously mentioned, this proximity is calculated by the cosine similarity between the unique vector of the aspect's terms and the term under evaluation. For term selection, we consider β = {0.2, 0.3, 0.4, 0.5, 0.6} and modify only the terms with semantic proximity equal to or above β. The best results by applied technique after applying the β threshold are marked in bold. The results marked with * indicate the best results according to β among the different techniques, and the results marked with ** indicate the best results achieved by the ABAA attack. The results were calculated by executing the target model ten times; the mean and standard deviation (± std) are shown. The semantic similarity (SS) was measured by the cosine similarity between the original input x and the modified input x' to evaluate the imperceptibility of the modifications in the generated adversarial examples.
Table 5
ABAA: Aspect-Based Adversarial Attack results. AA: After-Attack Accuracy, SS: Semantic similarity. The best results by applied technique are marked in bold. Results marked with * indicate the best results according to β, while results marked with ** indicate the best results achieved.

β | ABAA Random replace (AA / SS) | ABAA Replace (AA / SS) | ABAA Delete (AA / SS) | ABAA Hybrid (AA / SS)
0.2 | 63.50 ± 0.74 / 0.86 | 61.88 ± 0.99* / 0.93 | 63.28 ± 0.96 / 0.87 | 63.75 ± 0.85 / 0.90
0.3 | 63.50 ± 0.56 / 0.88 | 62.08 ± 0.90* / 0.94 | 63.53 ± 0.69 / 0.89 | 63.58 ± 0.51 / 0.99
0.4 | 63.33 ± 0.68 / 0.91 | 62.23 ± 0.93* / 0.95 | 63.32 ± 0.74 / 0.91 | 64.03 ± 0.59 / 0.91
0.5 | 62.66 ± 0.66 / 0.95 | 62.46 ± 0.47 / 0.97 | 62.31 ± 0.70* / 0.96 | 63.28 ± 0.65 / 0.92
0.6 | 61.94 ± 0.69 / 0.99 | 61.70 ± 0.37** / 0.99 | 61.96 ± 0.59 / 0.99 | 63.56 ± 0.65 / 0.93
Target model: 82.60 ± 0.47
Reference attack: 74.50 ± 0.60

The results obtained from ABAA evidence the relevance of the proposal, since it shows higher effectiveness in fooling the target model, causing the target model's accuracy to drop by 20.90%. In terms of attack success rate, the ABAA attack outperforms the reference results previously obtained: after the ABAA attack, the model resisted 503 modified instances, leading to a success rate of 25.26% and an after-attack accuracy of 61.70%. Figure 2a illustrates the effect of each implemented technique on the model's accuracy. It can be appreciated that the replace technique has the greatest negative impact on the target model's results. Likewise, we can observe that applying a higher β to filter the terms to be modified makes it possible to fool the target model with higher effectiveness. On the other hand, Figure 2b illustrates the positive effect of aspect-based adversarial examples on preserving the semantic similarity between the original inputs x and the generated adversarial examples x'. In contrast to the reference attack, it is evident that modifying only the terms related to the evaluated aspects makes it possible to maintain the input's readability due to the minimal modifications.

Figure 2: Comparison of the results obtained from the ABAA attack according to β. a) Accuracy by technique; b) Semantic similarity by technique.

5.4 Discussion
Figure 3 compares the ABAA accuracy results against the reference attack results. With ABAA, our best result is achieved with the replace technique, filtering the terms to be modified with β = 0.6 (refer to Fig. 2). Through this comparison, we can see the ABAA attack's effectiveness, since it outperformed the impact of the document-level reference adversarial attack by a 12.81% difference. Figure 4 shows the positive effect of our proposal in keeping the modifications as minimal as possible, achieving a semantic similarity of up to 0.99 between the original input and the adversarial example created from it.

Figure 3: Comparison of ABAA results against the target model accuracy and the reference attack results

Figure 4: Comparison of the semantic similarity achieved by ABAA by technique against the reference semantic similarity

To illustrate how terms are filtered according to the evaluated aspect, Table 6 presents adversarial examples generated by the reference attack and by the ABAA attack applying the delete technique. Through these examples, it is possible to observe the positive impact of ABAA in maintaining the input readability thanks to the minimal number of modified terms.

Table 6
Adversarial examples generated via the reference attack and the ABAA attack.
Opinion | Adversarial example from reference attack (delete technique) | Adversarial example from ABAA attack (delete technique, β = 0.6)
everyone raved atmosphere elegant rooms absolutely great vibe lots people | atmosphere rooms absolutely great | everyone raved atmosphere elegant great vibe lots people
very cozy and warm inside | and warm inside | very cozy and warm inside
nice try snag outside table | nice table | nice try snag outside table
like ambience dark original | like ambiance | like ambience dark original

Observing the results obtained, and continuing the evaluation of our proposal's potential, we performed another experiment to evaluate its generality and transferability, using the same target model but a dataset from a different domain. In this case, we used the English laptop dataset from SemEval; like the restaurant dataset, it consists of a training set and a test set, both containing customer reviews with annotations identifying the aspects mentioned and the sentiment polarity of each aspect. Table 7 presents the results obtained in this evaluation. The results were calculated by executing the target model ten times; the mean and standard deviation (± std) are shown.

Table 7
Aspect-Based Adversarial Attack results by applied technique on the laptop dataset. BA: Before-Attack Accuracy, AA: After-Attack Accuracy, SS: Semantic similarity. The best results obtained are marked in bold.

Technique (β) | BA | AA | SS
Target model | 77.48 ± 0.66 | - | -
ABAA Replace (β = 0.4) | - | 60.248 ± 3.11 | 0.786
ABAA Random replace (β = 0.5) | - | 60.011 ± 3.16 | 0.788
ABAA Delete (β = 0.4) | - | 60.110 ± 3.25 | 0.788
ABAA Hybrid (β = 0.3) | - | 60.110 ± 3.88 | 0.776

As can be noticed, our attack significantly impacts the model's results. From this, we proved the generality and transferability across contexts of the aspect-based adversarial examples designed, proving their context independence, since the ABAA attack maintains the same negative impact on the model's accuracy without any additional adjustment. The transferability of adversarial examples is an outstanding feature that adversarial strategies have to demonstrate when they are transferred from one model to another while maintaining their effectiveness. Until now, the lack of approaches that address tasks in a particularized way has prevented the effective transfer of attacks among models, even when these attacks are carried out on the same task. Nevertheless, thanks to the results obtained during the evaluation of our proposal on different datasets, it is possible to observe the positive effect of particularizing the design of adversarial examples by considering the characteristics of the task; in our case, aspect-level sentiment analysis can be attacked in different domains without showing dependence on a specific context. After evaluating our aspect-based adversarial attack against the reference attack and observing the performance of our proposal, the principal remarks are:
• Document-level techniques fail to fool the target model effectively, even though the modifications created considerable changes and impacted the input readability, semantics, and syntax (refer to Table 6). Through the achieved results, we show our attack's effectiveness, since it outperformed the impact of the document-level adversarial attack by a 12.81% difference.
• Since the reference attack's modifications are not particularized to the ABSA task, we observed that its techniques do not consider the relation between terms and aspects, so the semantic connection that runs through the text is not broken and, in a sense, there are no effective modifications for the target model. Furthermore, because the terms to be modified are not selected according to the evaluated aspects, input terms not related to the aspects are unnecessarily modified, impacting the imperceptibility of the modifications and, as a consequence, the input readability. Comparing the adversarial examples from the reference attack and the ABAA attack (refer to Table 6), we observe that our proposal maintains a 99% semantic similarity between the original input and the adversarial example created from it.
• The results obtained from the evaluation of ABAA on a different dataset show the generality and transferability of the ABAA attack across different contexts, exhibiting context independence and maintaining roughly the same magnitude of negative impact on the target model's accuracy (refer to Table 7).

From the results obtained, we showed the relevance of designing adversarial examples through modifications based on the characteristics of the task addressed by the deep learning models. Besides, we demonstrated that the modifications designed to attack the target model effectively surpassed previous strategies. Hence, we showed that our aspect-based adversarial examples effectively degrade the accuracy obtained in the reference results while preserving the semantics and syntax of the inputs, thus keeping the fundamental characteristics of adversarial examples. In this sense, we carried out modifications as small as possible but capable of confusing the model. The negative impact that task-oriented adversarial examples have on the models compels us to continuously explore new vulnerabilities in order to propose defense mechanisms that cover them effectively and, consequently, guarantee trust in the results obtained through DL models. Therefore, we should analyze and propose further attack and defense methods, since the models are susceptible to attacks. In particular, the accuracy of deep learning models can be decreased through adversarial examples, as we showed in this work. Deep learning models implemented in various areas, such as sentiment analysis, have not yet completely solved their application problem, and improving results by a few percentage points, or even tenths of a percentage point, is an uphill task. Hence, research on attack and defense methods is critical, since the attack presented here decreased accuracy by more than twenty percentage points in the sentiment analysis task. Given the benefits that sentiment analysis models bring to the educational field, we consider that models should incorporate, from their design, defense mechanisms to prevent future attacks and mitigate negative consequences. We expect this work to motivate further research and development of new attacks and defenses for educational sentiment analysis models.

6. Conclusions
The main contribution of this work is the formalization of aspect-based adversarial examples, which considers the existing aspect-term relation to determine the terms to be modified. Unlike previous works, our proposed strategy for generating aspect-based adversarial examples considers aspect-term information to drive the modifications that must be performed to negatively impact the models' accuracy.
This latter characteristic ensures that the adversarial examples maintain the input readability, semantics, and syntax, obtaining a 99% semantic similarity between the original input and its adversarial example and causing the model's accuracy to drop by 20.9%. For the experimental stage, we determined the aspect-term relation based on the semantic proximity of each term in an opinion to the evaluated aspect, in order to filter the terms that need to be modified. From the results obtained, it is possible to conclude that aspect-based adversarial examples have a positive impact on fooling the target model, making its accuracy drop drastically. Moreover, since the terms to be modified are selected by semantic similarity, the perceptibility of the modifications made is minimized. Besides, we evaluated the generality and transferability of our proposed aspect-based adversarial examples by evaluating them on datasets from different domains, demonstrating context independence by maintaining a negative impact on the model's results. As working directions, we will evaluate our aspect-based adversarial examples on different target models with different adversary knowledge, different datasets, and architectures such as BERT or attention mechanisms, which will allow us to compare our proposal with current work that approaches adversarial example generation for sentiment analysis models at the aspect level.

Acknowledgments
This work is supported by CONAHCYT/México scholarship 814461. Besides, it was funded by Cátedras-CONAHCYT projects 882 and 613.

References
[1] B. Liu, Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data, Springer Berlin Heidelberg, 2011. URL: http://dx.doi.org/10.1007/978-3-642-19460-3. doi:10.1007/978-3-642-19460-3.
[2] T. Shaik, X. Tao, Y. Li, C. Dann, J. McDonald, P. Redmond, L. Galligan, A review of the trends and challenges in adopting natural language processing methods for education feedback analysis, IEEE Access 10 (2022) 56720–56739. URL: http://dx.doi.org/10.1109/ACCESS.2022.3177752. doi:10.1109/access.2022.3177752.
[3] B. Pang, L. Lee, S. Vaithyanathan, Thumbs up? sentiment classification using machine learning techniques, in: Proceedings of the 2002 Conference on Empirical Methods in Natural Language Processing (EMNLP 2002), Association for Computational Linguistics, 2002, pp. 79–86. URL: https://aclanthology.org/W02-1011. doi:10.3115/1118693.1118704.
[4] E. Riloff, J. Wiebe, Learning extraction patterns for subjective expressions, in: Proceedings of the 2003 Conference on Empirical Methods in Natural Language Processing, 2003, pp. 105–112. URL: https://aclanthology.org/W03-1014.
[5] S. Poria, D. Hazarika, N. Majumder, R. Mihalcea, Beneath the tip of the iceberg: Current challenges and new directions in sentiment analysis research, IEEE Transactions on Affective Computing 14 (2023) 108–132. URL: https://doi.org/10.1109/taffc.2020.3038167. doi:10.1109/taffc.2020.3038167.
[6] B. Liu, L. Zhang, A survey of opinion mining and sentiment analysis, in: Mining Text Data, Springer, 2012, pp. 415–463. doi:10.1007/978-1-4614-3223-4_13.
[7] Z. Gong, W. Wang, B. Li, D. Song, W.-S. Ku, Adversarial texts with gradient methods, 2018. URL: https://arxiv.org/abs/1801.07175. doi:10.48550/ARXIV.1801.07175.
[8] Y.-T. Tsai, M.-C. Yang, H.-Y. Chen, Adversarial attack on sentiment classification, in: Proceedings of the 2019 ACL Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP, Association for Computational Linguistics, Florence, Italy, 2019, pp. 233–240.
URL: https://aclanthology.org/W19-4824. doi:10.18653/v1/W19-4824.
[9] J. Li, S. Ji, T. Du, B. Li, T. Wang, TextBugger: Generating adversarial text against real-world applications, in: Proceedings 2019 Network and Distributed System Security Symposium, Internet Society, 2019. URL: https://doi.org/10.14722/ndss.2019.23138. doi:10.14722/ndss.2019.23138.
[10] M. Alzantot, Y. Sharma, A. Elgohary, B.-J. Ho, M. Srivastava, K.-W. Chang, Generating natural language adversarial examples, in: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, 2018. URL: https://doi.org/10.18653/v1/d18-1316. doi:10.18653/v1/d18-1316.
[11] R. Jia, P. Liang, Adversarial examples for evaluating reading comprehension systems, in: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, 2017. URL: https://doi.org/10.18653/v1/d17-1215. doi:10.18653/v1/d17-1215.
[12] C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, R. Fergus, Intriguing properties of neural networks, arXiv preprint arXiv:1312.6199 (2013).
[13] W. E. Zhang, Q. Z. Sheng, A. Alhazmi, C. Li, Adversarial attacks on deep-learning models in natural language processing: A survey, ACM Transactions on Intelligent Systems and Technology 11 (2020) 1–41. URL: http://dx.doi.org/10.1145/3374217. doi:10.1145/3374217.
[14] B. Liang, H. Li, M. Su, P. Bian, X. Li, W. Shi, Deep text classification can be fooled, in: Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, IJCAI-18, International Joint Conferences on Artificial Intelligence Organization, 2018, pp. 4208–4215. URL: https://doi.org/10.24963/ijcai.2018/585. doi:10.24963/ijcai.2018/585.
[15] Mamta, A. Ekbal, Adversarial sample generation for aspect based sentiment classification, in: Y. He, H. Ji, S. Li, Y. Liu, C.-H. Chang (Eds.), Findings of the Association for Computational Linguistics: AACL-IJCNLP 2022, Association for Computational Linguistics, Online only, 2022, pp. 478–492. URL: https://aclanthology.org/2022.findings-aacl.44.
[16] X. Yuan, P. He, Q. Zhu, X. Li, Adversarial examples: Attacks and defenses for deep learning, IEEE Transactions on Neural Networks and Learning Systems 30 (2019) 2805–2824. URL: http://dx.doi.org/10.1109/TNNLS.2018.2886017. doi:10.1109/tnnls.2018.2886017.
[17] Y. Gil, Y. Chai, O. Gorodissky, J. Berant, White-to-black: Efficient distillation of black-box adversarial attacks, in: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), Association for Computational Linguistics, Minneapolis, Minnesota, 2019, pp. 1373–1379. URL: https://aclanthology.org/N19-1139. doi:10.18653/v1/N19-1139.
[18] J. Zhang, C. Li, Adversarial examples: Opportunities and challenges, IEEE Transactions on Neural Networks and Learning Systems (2019) 1–16. URL: http://dx.doi.org/10.1109/TNNLS.2019.2933524. doi:10.1109/tnnls.2019.2933524.
[19] M. T. Ribeiro, S. Singh, C. Guestrin, Semantically equivalent adversarial rules for debugging NLP models, in: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Association for Computational Linguistics, Melbourne, Australia, 2018, pp. 856–865. URL: https://aclanthology.org/P18-1079. doi:10.18653/v1/P18-1079.
[20] J. Gao, J. Lanchantin, M. L. Soffa, Y. Qi, Black-box generation of adversarial text sequences to evade deep learning classifiers, in: 2018 IEEE Security and Privacy Workshops (SPW), IEEE, 2018. URL: http://dx.doi.org/10.1109/SPW.2018.00016. doi:10.1109/spw.2018.00016.
[21] D. Jin, Z. Jin, J. T. Zhou, P. Szolovits, Is BERT really robust? A strong baseline for natural language attack on text classification and entailment, Proceedings of the AAAI Conference on Artificial Intelligence 34 (2020) 8018–8025. URL: https://doi.org/10.1609/aaai.v34i05.6311. doi:10.1609/aaai.v34i05.6311.
[22] Y. Xu, X. Zhong, A. Jimeno Yepes, J. H. Lau, Grey-box adversarial attack and defence for sentiment classification, in: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Association for Computational Linguistics, 2021. URL: http://dx.doi.org/10.18653/v1/2021.naacl-main.321. doi:10.18653/v1/2021.naacl-main.321.
[23] Y. Ma, H. Peng, E. Cambria, Targeted aspect-based sentiment analysis via embedding commonsense knowledge into an attentive LSTM, volume 32, Association for the Advancement of Artificial Intelligence (AAAI), 2018. URL: http://dx.doi.org/10.1609/aaai.v32i1.12048. doi:10.1609/aaai.v32i1.12048.
[24] Y. Xiao, G. Zhou, Syntactic edge-enhanced graph convolutional networks for aspect-level sentiment classification with interactive attention, IEEE Access 8 (2020) 157068–157080. URL: http://dx.doi.org/10.1109/ACCESS.2020.3019277. doi:10.1109/access.2020.3019277.
[25] W. Wang, R. Wang, J. Ke, L. Wang, TextFirewall: Omni-defending against adversarial texts in sentiment classification, IEEE Access 9 (2021) 27467–27475. URL: http://dx.doi.org/10.1109/ACCESS.2021.3058278. doi:10.1109/access.2021.3058278.
[26] M. Vázquez-Hernández, L. Villaseñor-Pineda, M. Montes-y-Gómez, A semantic-proximity term-weighting scheme for aspect category detection, Procesamiento del Lenguaje Natural (2022) 117–127. URL: https://doi.org/10.26342/2022-69-10. doi:10.26342/2022-69-10.