<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A Comparative Study On Sentiment Lexicons For Automatic Labeling</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Zohra Mehenaoui</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Chayma Merabti</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Houda Tadjer</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yacine Lafi</string-name>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Computer Science Department, University 8 Mai 45</institution>
          ,
          <addr-line>Guelma</addr-line>
          ,
          <country country="DZ">Algeria</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>LabSTIC laboratory, University 8 Mai 45</institution>
          ,
          <addr-line>Guelma</addr-line>
          ,
          <country country="DZ">Algeria</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Sentiment analysis is a natural language processing task that involves extracting meaningful information about people's opinions and sentiments towards products, services, and more, which can be used in several applications. This task relies on data published on online platforms. With the increasing use of the World Wide Web, a huge amount of data can be exploited. To do so, this data must be available in a suitable format, such as a dataset. A sentiment analysis dataset generally consists of reviews or comments and their sentiment labels (positive, negative, or neutral). Experts can do the labeling manually, but this requires a lot of time and energy, especially when dealing with massive amounts of data. In this paper, we performed automatic labeling of a dataset of 5200 student comments using four well-known sentiment analysis tools, namely VADER, TextBlob, SpaCy, and SentiWordNet, and compared them to find the most efficient one for automatic labeling of this dataset. The results showed that TextBlob outperforms the other tools, with an accuracy of 92% and an F1-score of 89%.</p>
      </abstract>
      <kwd-group>
        <kwd>Sentiment analysis</kwd>
        <kwd>labeling</kwd>
        <kwd>VADER</kwd>
        <kwd>TextBlob</kwd>
        <kwd>SentiWordNet</kwd>
        <kwd>SpaCy</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Online reviews reflect user sentiments and preferences. At least 32% of users rate products on shopping
sites; 33% of them leave reviews, and 88% trust these reviews [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Sentiment Analysis (SA), also known as
Opinion Mining (OM), is a Natural Language Processing (NLP) task that entails gathering relevant
data to determine people’s opinions and sentiments on products, services, etc. [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. The main process
of sentiment analysis is sentiment classification, which is generally performed using one of two
approaches: the lexicon-based approach or the machine learning-based approach. Some studies
combine the two in a hybrid approach.
      </p>
      <p>
        The lexicon-based approach relies on sentiment lexicons, which are dictionaries containing a collection
of sentiment words labeled with their corresponding polarity (positive, negative, or neutral) and their
sentiment scores. The sentiment of an entire statement is generally determined either by adding the
scores of all its sentiment words or by calculating their mean [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. This approach has two variants:
the dictionary-based approach and the corpus-based approach. The dictionary-based approach starts from a set
of manually collected sentiment words and then builds a dictionary by adding related words, such as their
synonyms and antonyms, for example using WordNet [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. The corpus-based approach adds sentiment words
that are specific to the study domain [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
      <p>The machine learning-based approach uses supervised learning for sentiment classification,
employing labeled datasets. These datasets are typically partitioned into training and testing sets. The
training set is used to train machine learning classifiers such as Support Vector Machine (SVM),
Naïve Bayes (NB), Logistic Regression (LR), and others. Subsequently, the testing set is used to
assess the model’s performance.</p>
      <p>13th International Conference on Research in ComputIng at Feminine, May 20–21, 2024, Constantine, Algeria.
E-mail: mehenaoui.zohra@univ-guelma.dz (Z. Mehenaoui); merabti.chayma@univ-guelma.dz (C. Merabti);
tadjer.houda@univ-guelma.dz (H. Tadjer); lafi.yacine@univ-guelma.dz (Y. Lafi)</p>
      <p>ORCID: 0000-0002-6732-7839 (Z. Mehenaoui); 0009-0006-7254-6215 (C. Merabti); 0000-0001-7624-1343 (H. Tadjer);
0000-0001-8232-4196 (Y. Lafi)</p>
      <p>© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).</p>
      <p>To enhance the accuracy of the classification task and leverage the strengths of both approaches,
many researchers have combined these methodologies for sentiment classification, which is termed
the hybrid approach.</p>
      <p>Using either the lexicon-based or the machine learning-based approach requires labeled datasets,
which means assigning labels to individuals’ reviews or comments about entities such as products.
This annotation task typically demands manual effort, resulting in a slow and costly process, particularly
when dealing with large datasets. Consequently, some researchers choose automatic labeling methods to save
time and resources. This paper presents a comparative study of four well-known and widely used sentiment
lexicons and tools, namely VADER, TextBlob, SentiWordNet, and SpaCy, for automatically labeling a dataset from the Mark My
Professor educational website.</p>
      <p>The rest of the paper is organized as follows: Section 2 reviews existing literature pertinent to
our study, linking ideas and providing context. Section 3 details the methodology followed in the
comparative study presented in this paper. The obtained results are presented and discussed in Section
4. Finally, Section 5 concludes the study,
summarizing key findings and offering insights for future research.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related works</title>
      <p>
        With the increasing use of online platforms, users’ reviews and comments have become exploitable data
for researchers who apply sentiment analysis in various application domains. Some of these studies
were done using available datasets such as SST [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], SemEval [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] and more. However, many researchers
collected their own datasets or used unlabeled ones and therefore had to perform the labeling task themselves; some of
them labeled their datasets manually. For instance, Liu et al. [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] used manually labeled data to train a
language model. On the other hand, several studies have used sentiment lexicons to label the dataset
automatically. For example, Isnan et al. [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] applied sentiment analysis to the data collected from TikTok
reviews on Google Play, where they used VADER for the initial labeling (positive, negative, and neutral)
and performed sentiment classification using the SVM classifier. Borg and Anton [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] used VADER
along with a Swedish sentiment lexicon to initially label 168010 e-mails in order to classify sentiments. A
few studies used ratings to label their datasets, with each review labeled according to its
corresponding rating. In a study building a sentiment analysis model for the Android application
KlikIndomaret during the COVID-19 pandemic using the VADER lexicon and the transformers NLTK
library, the labeling was based on the star ratings of reviews of this Indonesian shopping application
in the Google Play store [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. Tama et al. [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] applied sentiment analysis to the Grocery and
Gourmet Food dataset from Amazon after labeling it using the star ratings of the reviews, comparing
two labeling methods named Average and Binary. The Average method labels reviews based on the average
rating adjusted to the amount of data available, while the Binary method divides the labels by using
certain assumptions. The results showed that the Average method performed better than the Binary
method.
      </p>
      <p>
        Bonta et al. [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] compared NLTK, TextBlob, and VADER for movie review classification and found that
VADER outperformed the others (77% accuracy vs. 74% for TextBlob and 62% for NLTK). TextBlob was
compared with SpaCy in another study, where the results showed that TextBlob was faster than SpaCy,
while SpaCy produced more accurate results and presented them in visual form using charts and
graphs [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ].
      </p>
      <p>
        An evaluation of different well-known lexicons was conducted on two Twitter datasets. The results
on the Stanford dataset demonstrated that VADER outperformed the other lexicons with an accuracy of
72%. AFINN-111 and Liu-Hu achieved 65%, while SentiWordNet and SentiStrength achieved 53% and
67%, respectively [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ].
      </p>
      <p>
        Another comparative study was conducted by Biswas et al. [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], who compared three
automatic lexicon-based sentiment labeling techniques: TextBlob, VADER, and AFINN, to assign
sentiments to two tweet datasets, SemEval-2013 and SemEval-2016, without any human assistance. The
AFINN labeling technique achieved the highest accuracy of 80.17% in the first dataset and 80.05% in the
second using a BiLSTM deep learning model.
      </p>
      <p>
        Automatic labeling of datasets can also be performed using machine learning classifiers. In
the study of Jazuli et al. [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ], the authors used the K-Nearest Neighbors algorithm to improve the
accuracy of sentiment analysis; the results showed that K-Nearest Neighbors reached an accuracy
of 79.43% with k=15 when applied to 1.409 data items. Most of the works cited above
used datasets derived from social networks. In this study, we test labeling tools on a dataset from
a different context, that of e-learning, since our objective is to leverage learner reviews in
online learning environments to enhance the learning process.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Methodology</title>
      <sec id="sec-3-1">
        <title>3.1. Dataset</title>
        <p>This section describes the methods used to achieve the study’s objectives, which involve applying
sentiment lexicons to label the dataset used in the research.</p>
        <p>
          We used a dataset collected from the Mark My Professor website, a Hungarian website dedicated to
the evaluation of higher education teachers and trainers by their students [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ]. This dataset consists of 5200
reviews from learners regarding the courses presented by their professors: 3372 positive reviews, 982
negative reviews, and 846 neutral reviews. The reviews were collected in Hungarian; we used
the English translation of the comments to examine the chosen tools.
        </p>
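        <p>The class counts reported above can be checked with a few lines of Python. The sketch below is only illustrative: the dictionary name and the idea of a majority-class baseline are our own additions, not part of the original study. The share of the largest class is the accuracy that a trivial labeler assigning "positive" to every review would obtain, which gives some context for the accuracies reported in Section 4.</p>
        <preformat>
```python
# Class counts as reported for the Mark My Professor dataset.
# The variable names are illustrative, not taken from the study.
counts = {"positive": 3372, "negative": 982, "neutral": 846}

# Total number of reviews; should match the 5200 stated in the text.
total = sum(counts.values())

# Share of each class in the dataset.
shares = {label: n / total for label, n in counts.items()}

# Accuracy of a hypothetical labeler that always answers with the
# most frequent class (an assumption used only for comparison).
majority_baseline = max(counts.values()) / total

for label, share in shares.items():
    print(f"{label}: {share:.2%}")
print(f"majority-class baseline accuracy: {majority_baseline:.2%}")
```
        </preformat>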
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Sentiment lexicons</title>
        <p>
          Sentiment lexicons are lexical resources used for sentiment analysis. They contain lexical units with
their sentiment polarities or sentiment scores, which are used to determine the overall sentiment of a written
text [18]. The sentiment lexicons and tools used in this study are described below.
        </p>
        <sec>
          <title>3.2.1. VADER</title>
          <p>
            VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon and rule-based sentiment analysis tool that is especially adapted to the sentiments
expressed on social media. It was created by Hutto and Gilbert [19] to address the issue of interpreting the vocabulary, symbols, and
writing style found in social media. It captures both the polarity (positive, neutral, or negative) and the emotional
strength of a text, and the authors have made its Python code public
as open source. VADER is widely used on social media platforms such as Twitter due to its ability to
recognize abbreviations and written emojis [
            <xref ref-type="bibr" rid="ref14">14</xref>
            ]. Although the dataset used in our study was not taken
from social media, VADER is so widely used that we aimed to assess its performance on e-learning
reviews.
          </p>
          <p>VADER does not require any preprocessing thanks to its handling of extra letters (like "gooood"),
emojis, capitalization, etc. It can be installed easily with this command:</p>
          <preformat>pip install vaderSentiment</preformat>
          <p>It analyzes every word in the sentence to see whether that word is included in the VADER lexicon. The
polarity_scores() function returns the positive, negative, and neutral scores of the text,
as well as the compound score, which is computed by summing the valence scores of the words
and normalizing the result. The compound score ranges from -1 to +1, where -1 indicates the
most extreme negative sentiment and +1 indicates the most extreme positive sentiment.
To determine the overall sentiment of a statement, standardized thresholds are set and used for the
classification process. We used the typical threshold values:
a text is positive when the compound score is &gt;= 0.05, neutral when the
compound score is &gt; -0.05 and &lt; 0.05, and negative when the compound score is &lt;=
-0.05. Table 1 shows VADER's classification of three samples taken from the dataset.</p>
          <p>Table 1 (sample text, score): "Useful subject and interesting presentation." 0.0;
"Totally correct, fulfilling requirement." 0.0;
"I’m very disappointed with the quality of feedback on my last assignment." 0.176.
Labels: Neg, Neu.</p>
        </sec>
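        <p>The threshold rule described for VADER can be sketched as follows. This is a minimal illustration rather than the authors' code: the function names label_from_compound and vader_compound are hypothetical, and the second function assumes the vaderSentiment package is installed. It is imported lazily so that the threshold rule itself can be tried without the package.</p>
        <preformat>
```python
def label_from_compound(compound, threshold=0.05):
    """Map a compound score in [-1, 1] to a sentiment label.

    Uses the typical VADER thresholds: positive at or above +0.05,
    negative at or below -0.05, neutral in between.
    """
    if compound >= threshold:
        return "positive"
    if -compound >= threshold:  # compound is at or below -threshold
        return "negative"
    return "neutral"


def vader_compound(text):
    """Return VADER's compound score for a text.

    Requires `pip install vaderSentiment`; imported here so that
    label_from_compound works even when the package is missing.
    """
    from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

    analyzer = SentimentIntensityAnalyzer()
    return analyzer.polarity_scores(text)["compound"]


# Example with hand-picked scores (no VADER call needed):
for score in (0.62, 0.0, -0.4):
    print(score, label_from_compound(score))
```
        </preformat>
        <p>With the package installed, a comment could be labeled with label_from_compound(vader_compound(comment)).</p>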
        <sec id="sec-3-2-1">
          <title>3.2.2. TextBlob</title>
          <p>
            TextBlob is a popular open-source, easy-to-use Python NLP library for text processing, covering
tasks such as sentiment labeling, tokenization, etc. It features a sentiment property
that yields a tuple of the form Sentiment(polarity, subjectivity), where the polarity ranges from -1.0
to 1.0 (from highly negative to highly positive) and the subjectivity from 0.0 to 1.0 (from highly objective
to highly subjective) [
            <xref ref-type="bibr" rid="ref13">13</xref>
            ]. Like VADER, TextBlob classifies the sentiment of a given text using a
threshold; we used the same threshold values as for VADER. We imported TextBlob in
Python using the instruction:
          </p>
          <preformat>from textblob import TextBlob</preformat>
          <p>Sample comments from the dataset classified with TextBlob: "I got max points in my exam :D";
"Totally correct, fulfilling requirement."; "I’m feel intruth anxious about the approaching examination."</p>
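          <p>A minimal sketch of how a list of comments could be labeled from TextBlob polarity is given below. It is an illustration under stated assumptions, not the authors' implementation: label_comments and textblob_polarity are hypothetical helper names, the 0.05 threshold is the VADER-style value mentioned in the text, and the scorer is passed in as a parameter so that another tool can be substituted. The textblob package is only needed when the default scorer is actually called.</p>
          <preformat>
```python
def textblob_polarity(text):
    """Return TextBlob's polarity in [-1.0, 1.0] for a text.

    Requires `pip install textblob`; imported lazily so the labeling
    logic below can be exercised with any other scoring function.
    """
    from textblob import TextBlob

    return TextBlob(text).sentiment.polarity


def label_comments(comments, polarity_fn=textblob_polarity, threshold=0.05):
    """Assign positive/negative/neutral labels to a list of comments."""
    labels = []
    for comment in comments:
        polarity = polarity_fn(comment)
        if polarity >= threshold:
            labels.append("positive")
        elif -polarity >= threshold:  # polarity is at or below -threshold
            labels.append("negative")
        else:
            labels.append("neutral")
    return labels


# Demonstration with a stand-in scorer (made-up polarities), so the
# example runs even without textblob installed.
demo_scores = {"great course": 0.8, "boring lecture": -0.6, "it was held": 0.0}
print(label_comments(list(demo_scores), polarity_fn=demo_scores.get))
```
          </preformat>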
        </sec>
        <sec id="sec-3-2-2">
          <title>3.2.3. SentiWordNet</title>
          <p>
            SentiWordNet is a tool that analyzes sentiments using WordNet, assigning scores, based on evaluations by judges, to
word sets regarding positivity, negativity, or neutrality. Each score lies in a numerical range
from 0 to 1, where higher values indicate a stronger degree of the corresponding sentiment. It is used in NLP to gauge the tone of
words and phrases in written content for sentiment analysis, opinion mining, and text categorization
[
            <xref ref-type="bibr" rid="ref14">14</xref>
            ].
          </p>
          <p>
            With SentiWordNet, the overall sentiment of a statement is positive when the positive score is greater
than the negative score, negative when the negative score is greater than the positive score, and neutral
otherwise. The positive and negative scores are the degrees of positivity and negativity assigned to each text;
the degree of objectivity can be calculated as 1 - (positive score + negative score). We used the version
of SentiWordNet available in the NLTK corpus collection, importing it with the instruction:
          </p>
          <preformat>from nltk.corpus import sentiwordnet as swn</preformat>
        </sec>
        <sec>
          <title>3.2.4. SpaCy</title>
          <p>
            SpaCy is a fast and efficient Python natural language processing (NLP) library that uses machine learning. It provides
pre-trained models for various languages and offers a range of text-processing features:
tokenization, part-of-speech (POS) tagging, entity recognition, and dependency parsing. It comes with
a friendly interface and concise documentation, which makes it popular among scholars and programmers for chatbots,
sentiment analysis, etc. [
            <xref ref-type="bibr" rid="ref13">13</xref>
            ]. Using SpaCy requires installing the tool with the command:
          </p>
          <preformat>pip install spacy</preformat>
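          <p>For the SentiWordNet decision rule described in Section 3.2.3 (positive when the positive score exceeds the negative score, negative in the opposite case, neutral otherwise), a minimal sketch is given below. The paper does not state how word-level scores were aggregated into text-level scores, so the averaging in text_scores is an assumption, and all function names are hypothetical. The word-level pairs would come from SentiWordNet entries, for example via the pos_score() and neg_score() methods of the synsets returned by NLTK.</p>
          <preformat>
```python
def text_scores(word_scores):
    """Average word-level (positive, negative) pairs into text-level scores.

    Averaging is an assumption made for this sketch; the study does not
    specify its aggregation method.
    """
    if not word_scores:
        return 0.0, 0.0
    pos = sum(p for p, _ in word_scores) / len(word_scores)
    neg = sum(n for _, n in word_scores) / len(word_scores)
    return pos, neg


def objectivity(pos_score, neg_score):
    """Degree of objectivity: 1 - (positive score + negative score)."""
    return 1.0 - (pos_score + neg_score)


def swn_label(pos_score, neg_score):
    """Apply the comparison rule: the larger score decides the label."""
    if pos_score > neg_score:
        return "positive"
    if neg_score > pos_score:
        return "negative"
    return "neutral"


# Illustrative word-level scores (made up, not taken from SentiWordNet).
pos, neg = text_scores([(0.5, 0.0), (0.0, 0.25)])
print(swn_label(pos, neg), objectivity(pos, neg))
```
          </preformat>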
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Results and discussions</title>
      <p>After labeling the chosen dataset, the classification results obtained with VADER and TextBlob
are illustrated in Figure 1, and those obtained with SpaCy and SentiWordNet in Figure 2.</p>
      <p>Figure 1: Classification results of (a) VADER and (b) TextBlob.</p>
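      <p>Evaluating tool performance in auto-labeling requires metrics such as accuracy and the F1-score. Accuracy is defined as:</p>
      <p>
        <disp-formula><tex-math>\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}</tex-math></disp-formula>
      </p>
      <p>where TP, TN, FP, and FN denote the numbers of true positives, true negatives, false positives, and false negatives, respectively.</p>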
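      <p>The sketch below shows one way these metrics could be computed from the reference labels and the labels produced by a tool. It is a hedged illustration: the function names are hypothetical, and the macro-averaging of the F1-score over the three classes is an assumption, since the averaging scheme used in the study is not stated. For more than two classes, accuracy reduces to the share of comments whose predicted label matches the reference label.</p>
      <preformat>
```python
def accuracy(y_true, y_pred):
    """Share of items whose predicted label equals the reference label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def f1_for_label(y_true, y_pred, label):
    """F1-score of one class, computed from its TP, FP, and FN counts."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1-scores (assumed averaging)."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_label(y_true, y_pred, lab) for lab in labels) / len(labels)


# Tiny made-up example, not data from the study.
reference = ["pos", "pos", "neg", "neu"]
predicted = ["pos", "neg", "neg", "neu"]
print(f"accuracy = {accuracy(reference, predicted):.2%}")
print(f"macro F1 = {macro_f1(reference, predicted):.2%}")
```
      </preformat>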
      <p>The measured performance metrics showed that TextBlob reached the highest
accuracy, 92.35%. SpaCy and VADER were close to each other, with accuracies of 77.62% and 77.06%,
respectively, while SentiWordNet had the lowest accuracy, 71.40%.</p>
      <p>TextBlob also outperformed the other tools in terms of F1-score, reaching 89.23%. VADER
and SpaCy were close, with F1-scores of 70.78% and 71.68%, respectively, and SentiWordNet fell to
56.43%. Table 5 summarizes the performance metrics of the four tools.</p>
      <p>VADER did not achieve good results, even though it has proven effective for labeling, because it
primarily specializes in social media text, while the dataset used here was taken from an educational website.
The same goes for SpaCy, as it used VADER to classify sentiments; its results therefore do not differ
much from VADER's and can be regarded as a slight improvement on them. SentiWordNet performed
poorly, even though the dataset was not large, and it took the most time to calculate
the sentiment scores. In contrast, TextBlob performed best and was fast and easy to use, which makes
it a good choice for data similar to this context.</p>
      <p>One of the limitations of lexicon-based tools such as those we worked with is that they cannot
detect sarcasm, because they do not take the semantic meaning of the sentence into account, which
remains a challenge for them.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>Labeling a collected or unlabeled dataset is a significant challenge for the research community
because of its possible impact on the reliability of research findings, particularly those sensitive to even minor
errors. Manual labeling can be an expensive process in terms of time and expert resources. To address
this challenge, we examined well-known sentiment lexicons used by researchers in their investigations,
focusing on their application in the e-learning domain using the dataset described in the previous sections.
Our analysis revealed that TextBlob outperformed the other tools, achieving an accuracy of 92.34% and
an F1-score of 89.23%. SpaCy and VADER exhibited relatively close performance, with lower
accuracies of 77.62% and 77.06%, respectively, while SentiWordNet displayed the lowest accuracy at 71.40%.
Since a lexicon’s efficacy and limitations may depend on the dataset and context, it is
plausible that SentiWordNet and VADER may excel under different circumstances. Therefore, TextBlob’s
superiority in this study does not by itself establish it as the optimal tool for lexicon-based labeling
of sentiment analysis datasets. Furthermore, developing a lexicon tailored specifically to online reviews
related to e-learning, and applying it to similar datasets, could yield better performance.
Additionally, machine learning techniques such as semi-supervised learning, deep learning
models, and transformers (such as DistilBERT) hold promise for delivering
more precise labels because they are effective at capturing semantic relationships among words.</p>
    </sec>
    <sec id="sec-6">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this work, the author(s) used ChatGPT to correct errors and improve the
clarity of certain paragraphs, as well as Grammarly for grammar and spelling checks. All content
generated or suggested by these tools was critically reviewed and edited by the authors. The author(s)
affirm full responsibility for the accuracy, originality, and integrity of the final manuscript.</p>
      <p>[18] R. S. Jagdale, V. S. Shirsat, S. N. Deshmukh, Review on sentiment lexicons, in: 2018 3rd International
Conference on Communication and Electronics Systems (ICCES), IEEE, 2018, pp. 1105–1110.</p>
      <p>[19] C. Hutto, E. Gilbert, Vader: A parsimonious rule-based model for sentiment analysis of social media
text, in: Proceedings of the international AAAI conference on web and social media, volume 8,
2014, pp. 216–225.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>N.</given-names>
            <surname>Vedavathi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. K.</given-names>
            <surname>KM</surname>
          </string-name>
          ,
          <article-title>E-learning course recommendation based on sentiment analysis using hybrid elman similarity</article-title>
          ,
          <source>Knowledge-Based Systems</source>
          <volume>259</volume>
          (
          <year>2023</year>
          )
          <fpage>110086</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>M.</given-names>
            <surname>Birjali</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kasri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Beni-Hssane</surname>
          </string-name>
          ,
          <article-title>A comprehensive survey on sentiment analysis: Approaches, challenges and trends</article-title>
          ,
          <source>Knowledge-Based Systems</source>
          <volume>226</volume>
          (
          <year>2021</year>
          )
          <fpage>107134</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>P.</given-names>
            <surname>Nandwani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Verma</surname>
          </string-name>
          ,
          <article-title>A review on sentiment analysis and emotion detection from text</article-title>
          ,
          <source>Social Network Analysis and Mining</source>
          <volume>11</volume>
          (
          <year>2021</year>
          )
          <fpage>81</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>B.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <source>Sentiment analysis and opinion mining</source>
          , Springer Nature,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>T.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <article-title>Improving sentiment analysis via sentence type classification using bilstm-crf and cnn</article-title>
          ,
          <source>Expert Systems with Applications</source>
          <volume>72</volume>
          (
          <year>2017</year>
          )
          <fpage>221</fpage>
          -
          <lpage>230</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>X.</given-names>
            <surname>Ma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Zeng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Peng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Fortino</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <article-title>Modeling multi-aspects within one opinionated sentence simultaneously for aspect-level sentiment analysis</article-title>
          ,
          <source>Future Generation Computer Systems</source>
          <volume>93</volume>
          (
          <year>2019</year>
          )
          <fpage>304</fpage>
          -
          <lpage>311</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>K.-L.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.-J.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Guo</surname>
          </string-name>
          ,
          <article-title>Emoticon smoothed language models for twitter sentiment analysis</article-title>
          ,
          <source>in: Proceedings of the AAAI Conference on Artificial Intelligence</source>
          , volume
          <volume>26</volume>
          ,
          <year>2012</year>
          , pp.
          <fpage>1678</fpage>
          -
          <lpage>1684</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>M.</given-names>
            <surname>Isnan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. N.</given-names>
            <surname>Elwirehardja</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Pardamean</surname>
          </string-name>
          ,
          <article-title>Sentiment analysis for tiktok review using vader sentiment and svm model</article-title>
          ,
          <source>Procedia Computer Science</source>
          <volume>227</volume>
          (
          <year>2023</year>
          )
          <fpage>168</fpage>
          -
          <lpage>175</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>A.</given-names>
            <surname>Borg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Boldt</surname>
          </string-name>
          ,
          <article-title>Using vader sentiment and svm for predicting customer response sentiment</article-title>
          ,
          <source>Expert Systems with Applications</source>
          <volume>162</volume>
          (
          <year>2020</year>
          )
          <fpage>113746</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>A.</given-names>
            <surname>Budianto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Wirjodirdjo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Maflahah</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Kurnianingtyas</surname>
          </string-name>
          ,
          <article-title>Sentiment analysis model for klikindomaret android app during pandemic using vader and transformers nltk library</article-title>
          ,
          <source>in: 2022 IEEE international conference on industrial engineering and engineering management (IEEM)</source>
          , IEEE,
          <year>2022</year>
          , pp.
          <fpage>0423</fpage>
          -
          <lpage>0427</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>V.</given-names>
            <surname>Tama</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Sibaroni</surname>
          </string-name>
          , et al.,
          <article-title>Labeling analysis in the classification of product review sentiments by using multinomial naive bayes algorithm</article-title>
          ,
          <source>in: Journal of Physics: Conference Series</source>
          , volume
          <volume>1192</volume>
          ,
          IOP Publishing
          ,
          <year>2019</year>
          , p.
          <fpage>012036</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>V.</given-names>
            <surname>Bonta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Kumaresh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Janardhan</surname>
          </string-name>
          ,
          <article-title>A comprehensive study on lexicon based approaches for sentiment analysis</article-title>
          ,
          <source>Asian Journal of Computer Science and Technology</source>
          <volume>8</volume>
          (
          <year>2019</year>
          )
          <fpage>1</fpage>
          -
          <lpage>6</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>M.</given-names>
            <surname>Pandey</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Williams</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Jindal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Batra</surname>
          </string-name>
          ,
          <article-title>Sentiment analysis using lexicon based approach</article-title>
          ,
          <source>IITM Journal of Management and IT</source>
          <volume>10</volume>
          (
          <year>2019</year>
          )
          <fpage>68</fpage>
          -
          <lpage>76</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>M.</given-names>
            <surname>Al-Shabi</surname>
          </string-name>
          ,
          <article-title>Evaluating the performance of the most important lexicons used to sentiment analysis and opinions mining</article-title>
          ,
          <source>IJCSNS</source>
          <volume>20</volume>
          (
          <year>2020</year>
          )
          <fpage>1</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>S.</given-names>
            <surname>Biswas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Young</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Griffith</surname>
          </string-name>
          ,
          <article-title>A comparison of automatic labelling approaches for sentiment analysis</article-title>
          ,
          <source>arXiv preprint arXiv:2211.02976</source>
          (
          <year>2022</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>A.</given-names>
            <surname>Jazuli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Widowati</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Kusumaningrum</surname>
          </string-name>
          ,
          <article-title>Auto labeling to increase aspect-based sentiment analysis using k-nearest neighbors method</article-title>
          ,
          <source>in: E3S Web of Conferences</source>
          , volume
          <volume>359</volume>
          ,
          <publisher-name>EDP Sciences</publisher-name>
          ,
          <year>2022</year>
          , p.
          <fpage>05001</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>I.</given-names>
            <surname>Bouacida</surname>
          </string-name>
          ,
          <article-title>Sentiment Analysis and Opinion Mining Techniques for Learning Analytics</article-title>
          ,
          <source>Master's thesis</source>
          , Eötvös Loránd University, Budapest, Hungary,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>