<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>MINDS: A Multi-label Emotion and Sentiment Classification Dataset Related to COVID-19</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Anjali Bhardwaj</string-name>
          <email>bhardwaj.anjali200594@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Muhammad Abulaish</string-name>
          <email>abulaish@sau.ac.in</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science, South Asian University</institution>
          ,
          <addr-line>New Delhi</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
      </contrib-group>
      <volume>000</volume>
      <fpage>0</fpage>
      <lpage>0003</lpage>
      <abstract>
        <p>During times of crisis, such as the COVID-19 pandemic, there has been a sudden increase in information exchanges on social media. People gathered to share their feelings, experiences, knowledge, and ideas with one another. Twitter has emerged as the most authentic, admired, and widely used social media platform for users to express their sentiments, opinions, or emotions. Understanding the emotions expressed in text facilitates the development of empathetic machines. In this paper, we collect and annotate a large corpus of textual data that can be used to train classification models for detecting emotions and sentiments in user-generated content. We analyze user sentiments and emotions expressed in tweets during the third wave of the pandemic, caused by the Omicron sub-variant. Our curated dataset, MINDS, consists of 227,229 tweet instances that were annotated using a multi-model setup in order to quantify all aspects of model uncertainty. Each instance in the dataset is classified according to three sentiment classes (positive, negative, and neutral) and five emotion classes (sadness, joy, fear, disgust, and anger). We used the benchmark dataset of the SemEval-2018 E-c subtask to determine the classification threshold. The MINDS dataset is publicly available at http://www.abulaish.com/ldsa/dataset for research purposes.</p>
      </abstract>
      <kwd-group>
        <kwd>Emotion dataset</kwd>
        <kwd>Sentiment dataset</kwd>
        <kwd>Multi-label classification</kwd>
        <kwd>NLU</kwd>
        <kwd>COVID-19</kwd>
        <kwd>IBM Watson</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Human emotion is one of the fundamental components of cognitive processes. Without
emotions, humans would be lifeless stones; emotions are what keep us alive and define who we
are. Physiological changes frequently convey emotions, as they correspond to
psychological states that occur spontaneously and without conscious effort. An emotion is a complex
feeling caused by internal or external events concerning an object, such as a person, a topic, an
event, or an item, resulting from thought processing, e.g., “my family thinks it’s a good idea
for me to continue my education overseas, though they’ll miss me.” Emotions are the most
significant aspect of human understanding and have a positive impact on our physical health,
jobs, learning, and economic and social behaviors. In addition, whenever a choice must be made,
humans seek the opinions of others.</p>
      <p>During events such as pandemics, unrest, etc., there is a torrent of emotions. As people faced
the unprecedented challenge of COVID-19, their emotional responses became overwhelming
and erratic. Since November 2019, the global community has been dealing with this issue, which
has been disastrous for humanity. And due to social media, individuals were suddenly able to
share their experiences. In order to be together safely in the future, one had to be apart in the
interim.</p>
      <p>
        The classification of emotions from textual data is more complex than sentiment classification.
Although the terms emotion and sentiment are often used synonymously, in sentiment analysis they represent two
distinct concepts [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Emotion and sentiment classification is an active topic in AI with numerous
applications in the current technological era. Emotion analysis is used in a variety of AI-based
applications, including human-machine interfaces, cognitive psychology, intelligent devices, and
automated identification. Thus, research on this
topic enables the development of empathic machines [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>
        Numerous social media platforms, including Instagram, Facebook, etc., have become
indispensable for communicating with friends and family. In online social media, emotions are
typically expressed through emoticons and texts that are unstructured, informal, and massive
in size. Due to the immense size and unstructured nature of such a repository of data, it is difficult to extract meaningful
emotional information from it. Moreover, different types of emotions,
such as fear, anger, sadness, disgust, joy, and surprise, are not mutually exclusive; rather, they
are interconnected, such that one emotion causes or triggers other emotion(s) [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Therefore,
a document can be tagged with multiple emotions, making emotion detection a multi-label
classification problem requiring deep learning. Multiple mutually non-exclusive classes or labels
are predicted by deep learning algorithms [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. However, deep learning requires an enormous quantity of
labeled data. In order to train and evaluate algorithms for multi-label classification, we
curate and label a large dataset.
      </p>
      <p>
        Our contributions are summarized as follows:
• Curation of a large-scale multi-labeled dataset, MINDS (Multi-label emotIoN and
sentiment classification DataSet), of textual data containing 227,229 instances. The tweets
were collected using the six top trending hashtags: #Omicron, #OmicronVariant,
#OmicronVarient, #OmicronInIndia, #OmicronVirus, and #Omicronvirus india, from December
17, 2021 to February 4, 2022, using the Twitter API through the Tweepy library.
• Classification of each tweet into the most appropriate emotion categories (annotation
labels), which are among the universal emotions of Ekman [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], namely joy, sadness, fear,
anger, and disgust. The sixth universal emotion, surprise, is dropped because it is
ambiguous and can indicate both positive and negative sentiments. The sentiment
categories comprise positive, negative, and neutral, with numeric scores.
• Annotation of the dataset using a multi-model setup, namely the IBM Watson NLU API, the Komprehend Text Analysis
API, and the Text2emotion Python package, to quantify all aspects of
model uncertainty.
• Determination of a classification threshold based on the SemEval-2018 E-c subtask dataset.
      </p>
      <p>After determining the threshold value, the micro-average and macro-average f1-score for
each emotion were analyzed. We obtained a Jaccard index of 0.52 and micro-average and
macro-average f1-score values of 63% and 61%, respectively.</p>
      <p>The remaining sections are organized as follows. Section 2 provides a brief overview of
the related datasets for multi-label emotion and sentiment classification on textual data.
Section 3 provides background information on our work. Section 4 presents the
functional steps involved in the creation of our dataset. Finally, Section 5 concludes the paper.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Datasets</title>
      <p>
        Multi-label emotion and sentiment classification has increasingly become an active research
topic. Many researchers have investigated multi-label emotion classification in textual data [
        <xref ref-type="bibr" rid="ref3 ref4 ref6 ref7">3, 4, 6, 7</xref>
        ].
Kim et al. used multiple convolutional neural networks (CNNs) along with self-attention and
performed multi-label emotion classification on Twitter data [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Similarly, the authors of [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] used
transfer learning to enhance the multi-label emotion classification performance on Twitter
data.
      </p>
      <p>
        Most datasets are hand-annotated; examples include SemEval-2018 Task 1: Affect in Tweets
(AIT) [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], GoEmotions [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], EMOBANK [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], and Emotion Intensities (EmoInt) [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. The EmoInt
dataset [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] was created for detecting the intensities of four emotions in tweets.
      </p>
      <p>
        SemEval-2018 [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] is a multilingual dataset used to train and test supervised machine learning
algorithms. This dataset contains 10,690 instances, annotated based on 11 emotion categories.
The shared task evaluates automatic systems for E-c (multi-label Emotion-classification), EI-oc
(Emotion Intensity-ordinal classification), EI-reg (Emotion Intensity-regression), V-oc
(Valence-ordinal classification), and V-reg (Valence-regression) detection in three languages: English,
Spanish, and Arabic. The subtasks thus cover detecting an ordinal class of
emotion intensity (slightly sad, very angry, etc.) as well as an ordinal class of valence (or sentiment).
      </p>
      <p>
        GoEmotions [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] consists of 58k instances collected from English Reddit comments and
categorized into 27 emotion categories plus a neutral class. The authors used principal preserved
component analysis and conducted transfer learning experiments with existing benchmarks to
show that their dataset generalizes well to other domains. EMOBANK [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] is another dataset
containing more than 10,000 sentences labeled according to the Valence-Arousal-Dominance (VAD)
emotion representation model. The authors collected data from various sources, including essays, blogs, fiction, travel
guides, letters, newspapers, and news headlines, and annotated it from the perspectives of both readers and writers. In addition, a subset
of the dataset was categorically labeled using Ekman’s emotion model, making it
suitable for dual representational designs.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Background</title>
      <p>This section briefly discusses social media platforms, mainly Twitter, and their content.
Moreover, we discuss well-known APIs that process textual data to deliver the
evoked emotions and sentiments.</p>
      <sec id="sec-3-1">
        <title>3.1. Social Media</title>
        <p>Social media platforms have become modern communication tools and have a large user base
worldwide. Among these platforms, Twitter is the most authentic, admired, and extensively
used site on which users share their information, thoughts, ideas, and opinions or emotions
regarding social events, products, services, and political and marketing campaigns. Due to
constant updates to this repository of opinions, banter, facts, and other minutiae, Twitter has
garnered much interest from decision-makers, business leaders, and politicians, who have
an inherent desire to know people’s perspectives and opinions regarding specific topics.
Therefore, we choose Twitter data to analyze the emotions expressed in tweets. Collecting
data using an API is the most popular and recommended practice. Almost every social media
service provider has an API, supported by several libraries or packages, that assists users in various
data-extraction activities.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Well-known APIs</title>
        <p>Over the last few years, the number of APIs (Application Programming Interfaces) has increased, lowering
the barrier to using third-party text analytics functionality instead of building one's own model.
APIs play a significant role in today’s digital age as they encourage innovation, offer flexible
experiences, save cost, and are easily available. There are dozens of APIs that are essential
for digital transformation and for the creation and development of innovative models. The majority
of datasets are annotated manually and tend to be small. Moreover, no API-based
multi-label emotion and sentiment classification dataset is available. We therefore annotated our
dataset MINDS using a multi-model setup (i.e., two APIs and a Python package), whose components are discussed
below:</p>
        <sec id="sec-3-2-1">
          <title>3.2.1. IBM Watson NLU</title>
          <p>IBM developed the Watson NLU (Natural Language Understanding)<sup>1</sup> API, which analyzes data
with the help of text analytics to extract categories, classifications, concepts, keywords,
relations, entities, emotion, sentiment, semantic roles, and syntax. It uses deep learning to
extract meaning and metadata from unstructured textual data. Moreover, NLU
supports 25 languages, depending on the feature being analyzed. The sentiment feature analyzes
the sentiment towards specific target phrases as well as the sentiment of the document as a whole.
Likewise, the emotion feature analyzes the emotion conveyed by specific target phrases or by the document
itself.</p>
        </sec>
        <sec id="sec-3-2-2">
          <title>3.2.2. Komprehend Text Analysis</title>
          <p>Komprehend is a comprehensive document classification and NLP API<sup>2</sup> for software developers.
Its NLP models are trained on more than a billion documents and provide
state-of-the-art performance on the most common NLP use cases, such as sentiment analysis and emotion
detection. Moreover, this API supports 15 languages. The main advantages of this API are that it
works on diverse data, is accurate, supports flexible deployment, and maintains privacy. The
API uses Long Short-Term Memory (LSTM) networks to classify the sentiment of a text blob as
positive or negative. LSTMs represent sentences as a series of context-based forget-remember
decisions. The model was trained separately on social media and news data to handle informal
and formal language. Moreover, it is trained using various specific datasets for
different clients. Sometimes the sentiment classes (positive, negative, and neutral) are not enough
to understand the aspects associated with the underlying tone of a sentence. For this reason, the
API also offers an emotion analysis classifier, which is trained on a proprietary dataset. It can determine
whether the emotion conveyed through the text is sadness, happiness, fear, anger, boredom, or
excitement. Deep learning-based algorithms are used to capture features from the text data,
and these features are employed to categorize the emotions conveyed by the data. The classifier
was trained using CNNs on a tagged dataset.</p>
          <p><sup>1</sup> https://cloud.ibm.com/apidocs/natural-language-understanding</p>
          <p><sup>2</sup> https://komprehend.io/</p>
        </sec>
        <sec id="sec-3-2-3">
          <title>3.2.3. Text2emotion</title>
          <p>Text2emotion is a Python package<sup>3</sup> that extracts emotions from text, and it supports
five emotion categories: sad, happy, angry, fear, and surprise. Details of the package are as
follows:
• Text pre-processing: The primary goal is to clean the data and make the
content more suitable for emotion analysis. The unwanted textual parts are removed from
the text, and NLP techniques are used to obtain well-pre-processed text.
• Emotion investigation: The pre-processed text is analyzed to find the words
that express emotions or feelings, together with their emotion categories. The count of emotions
relevant to these words is stored.
• Emotion analysis: The output is a dictionary in which the emotion
categories and scores are represented as keys and values, respectively. A larger score for a
particular emotion category indicates that the text belongs to that category.</p>
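          <p>The three stages described above can be illustrated with a minimal lexicon-based sketch. It is not the package's implementation: the word list <monospace>TOY_LEXICON</monospace> is hypothetical, the category names are assumed, and normalising the counts so that the scores sum to 1 is an assumption made only for this illustration.</p>
          <preformat>
```python
from collections import Counter

# Hypothetical toy lexicon, for illustration only; a real lexicon is far larger.
TOY_LEXICON = {
    "happy": "Happy", "proud": "Happy", "fantastic": "Happy",
    "sad": "Sad", "miss": "Sad",
    "angry": "Angry",
    "afraid": "Fear", "scared": "Fear",
    "shocked": "Surprise",
}
CATEGORIES = ("Happy", "Angry", "Surprise", "Sad", "Fear")


def emotion_scores(text):
    """Return a dictionary mapping each emotion category to a score in [0, 1]."""
    # Stage 1, pre-processing: lower-case, split, strip surrounding punctuation.
    words = [w.strip(".,!?#\"'") for w in text.lower().split()]
    # Stage 2, investigation: count the emotion-bearing words per category.
    hits = Counter(TOY_LEXICON[w] for w in words if w in TOY_LEXICON)
    total = sum(hits.values())
    # Stage 3, analysis: categories are the keys, normalised counts the values.
    return {c: (round(hits[c] / total, 2) if total else 0.0) for c in CATEGORIES}


scores = emotion_scores("so proud of the scientists, i miss working with you all")
top_category = max(scores, key=scores.get)
```
          </preformat>
          <p>In this sketch the category with the largest value, here stored in <monospace>top_category</monospace>, is the one the text is taken to belong to.</p>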
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Dataset Creation</title>
      <p>This section gives the functional steps of our dataset creation and analysis, including data
extraction, data preprocessing, dataset annotation, evaluations, and observations for emotion and
sentiment classification. Fig. 1 shows the framework of our dataset annotation approach.</p>
      <p>[Figure 1: Framework of the dataset annotation approach. Tweets are crawled from Twitter through the Twitter REST API; the raw tweets are pre-processed; the pre-processed tweets are passed to the multi-model setup (IBM Watson NLU API, Komprehend API, and the Text2emotion Python package), which yields sentiments (positive, neutral, negative) and emotions.]</p>
      <p><sup>3</sup> https://pypi.org/project/Text2emotion/</p>
      <sec id="sec-4-1">
        <title>4.1. Data Extraction</title>
        <p>We created a data crawler using Python 3.6 and the Tweepy 4.6.0 library for the Twitter API to collect
COVID-19-related posts in the English language, mainly those about Omicron (the sub-variant of the COVID-19 virus
responsible for the third wave of infections), and stored them in a local repository.
To gain insights from public reactions and shared posts on Twitter and to model public
emotions, we collected tweets from December 17, 2021 to February 4, 2022. The tweets were
collected using the six top trending hashtags: #Omicron, #OmicronVariant, #OmicronVarient,
#OmicronInIndia, #OmicronVirus, and #Omicronvirus india. Table 1 shows the details of the
Twitter Omicron corpus, which contains 241,419 instances of raw tweets posted when the Omicron wave
had just begun. Through the Twitter API, we obtained various tweet-related information, such
as tweet id, text, author screen name, author id, created at, source, user verified, count, language,
favorite count, username, user id, location, etc.</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Data Pre-processing</title>
        <p>Pre-processing is an essential step before analyzing any dataset, as it deals with
outliers and noise so that the data are properly represented. Corpora curated from Twitter, such as
the one presented in Table 1, are often noisy and contain unwanted or irrelevant constituents;
eliminating such undesirable information is vital for better performance and efficiency
of the system. Since tweets are generated in an uncontrolled manner and are generally
unstructured and informal, basic pre-processing steps are applied to transform the input data into
a ready-to-analyze format. The following steps were performed as part of the pre-processing
stage:
1. The uppercase letters were converted into lowercase letters to normalize the input text
data.
2. HTML tags, URLs, extra white spaces, hexa-characters (UTF-8), double quotes, and
duplicate tweets were removed to ease further processing.
3. Stop-words, punctuation, and emoticons were not removed because they play a
significant role in understanding the context of tweets.</p>
        <p>Sample pre-processed tweets (Table 2):</p>
        <p>• not enough people through the doors to cover wages, much less rent and food costs and our
#omicron wave hasnt even hit yet.</p>
        <p>• #omicron is not mildno one against the businessmen/women how bravely the government stands
against them? or they reciprocally the same entity to another? “a” is not “all the time, sometime”
could be “a” in one point, and “a” acted on behalf of.</p>
        <p>• so proud of the scientists and ex-colleagues who continue to keep norwich safe i miss working
with you all - you’re so fantastic at what you do #norwich #omicron</p>
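        <p>The pre-processing steps listed in this section can be sketched as follows. This is a minimal illustration rather than the crawler's actual code: the function names and regular expressions are assumptions, and "hexa-characters" is interpreted here as literal escape sequences of the form \xNN left over in the raw text.</p>
        <preformat>
```python
import re

# Hex escapes \x3c and \x3e stand for the opening and closing angle brackets of a tag.
TAG_RE = re.compile(r"\x3c[^\x3e]+\x3e")
URL_RE = re.compile(r"https?://\S+|www\.\S+")
HEX_RE = re.compile(r"\\x[0-9a-fA-F]{2}")   # assumed form of the hexa-characters
WHITESPACE_RE = re.compile(r"\s+")


def clean_tweet(text):
    """Steps 1 and 2 for one tweet; stop-words, punctuation and emoticons are kept."""
    text = text.lower()                      # step 1: normalise case
    text = TAG_RE.sub(" ", text)             # step 2: drop HTML tags
    text = URL_RE.sub(" ", text)             #         drop URLs
    text = HEX_RE.sub(" ", text)             #         drop hexa-characters
    text = text.replace('"', "")             #         drop double quotes
    return WHITESPACE_RE.sub(" ", text).strip()   # collapse extra white space


def preprocess(tweets):
    """Clean every tweet and drop duplicates, keeping the first occurrence."""
    seen = set()
    cleaned = []
    for raw in tweets:
        tweet = clean_tweet(raw)
        if tweet and tweet not in seen:
            seen.add(tweet)
            cleaned.append(tweet)
    return cleaned
```
        </preformat>
        <p>Duplicates are detected after cleaning in this sketch, so two raw tweets that differ only in case, spacing, or an appended URL count as the same tweet; whether the original pipeline compared raw or cleaned text is not stated.</p>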
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Dataset Annotation</title>
        <p>As presented in Table 1, 241,419 tweets were scraped from Twitter, after which pre-processing
resulted in 227,229 instances of filtered tweets. Because Twitter data is highly emotion-rich,
a multi-model setup is employed to label the pre-processed tweets and analyze the different user
emotions within them. The main advantage of such a setup is that it diminishes the effect of the model
errors and bias of each individual model. Additionally, a multi-model setup
promotes reliability, consistency, and better labeling, and automatic text annotation
makes it possible to annotate large quantities of data in less time.</p>
        <p>
          Each independent API predicts scores to identify the evoked emotions and
sentiments. Based on these scores, each tweet is classified into the most appropriate emotion
categories, which are among the universal emotions of Ekman [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ], namely joy, sadness, anger,
fear, and disgust. The sixth universal emotion, surprise, is dropped because it is ambiguous
and can indicate both positive and negative sentiments. The sentiment
categories comprise positive, negative, and neutral, with scores. The Watson API, the Komprehend API,
and the Text2emotion Python package are employed in the multi-model setup for annotation.
The required scores are obtained in the following manner:
• Watson predicts the sentiment label, viz. positive, negative, or neutral, and the
emotions anger, fear, joy, disgust, and sadness, with corresponding scores for
both the sentiment (ranging from -1 to 1) and the emotions (ranging from 0 to 1), as shown
in Table 3 (based on the sample tweets given in Table 2).
• Komprehend labels the data with the same three sentiments, i.e., positive, negative, or
neutral, and with the emotions happy, sad, angry, fearful, excited, or bored; the scores for
both range from 0 to 1.
• Text2emotion, on the other hand, scores the input only for the emotion
categories happy, anger, sad, surprise, and fear; these scores range from 0 to 1.
        </p>
        <p>One difference between the score assignment of Watson and Komprehend is that the former
presents a single score labeling the sentiment of each tweet, whereas Komprehend returns
individual scores corresponding to each sentiment for the input data. However, since the ultimate
labels are the same for both, those from Watson are taken as the final value for sentiment
classification. Note that only the four emotions sadness, joy, fear, and anger are covered by all
of the above models, so the scores for the emotion disgust are taken directly from
Watson. The scores are converted into binary labels using the method discussed in Section 4.3.2. The method of
arriving at the threshold for the final annotation is discussed below.</p>
        <sec id="sec-4-3-1">
          <title>4.3.1. Classification Threshold</title>
          <p>
            In order to obtain the classification threshold, the SemEval-2018 [
            <xref ref-type="bibr" rid="ref8">8</xref>
            ] E-c subtask, a benchmark
dataset, is employed. SemEval-2018 contains 10,983 instances split into three subsets: a training
set, a validation set, and a testing set, with 6838, 886, and 3259 instances, respectively.
SemEval-2018 is labeled using the three models mentioned above, which provide real-valued scores for
each tweet. These real-valued scores are transformed into binary labels using a threshold value. The
threshold is taken to be the interquartile range (IQR), a statistic that is robust to outliers and
represents the amount of spread in the data better than the range.
          </p>
          <p>The real-valued scores of the aforementioned multi-model setup are separated according to the
ground-truth class (i.e., 0 and 1), and the IQR is calculated for each class
separately. The IQR of the scores whose ground-truth label is 1 is chosen as the threshold.
The resulting thresholds for the emotions sadness, joy, fear, disgust, and anger are 0.26,
0.20, 0.25, 0.06, and 0.12, respectively.</p>
        </sec>
        <sec id="sec-4-3-2">
          <title>4.3.2. Binary Annotation</title>
          <p>The SemEval-2018 dataset is then classified based on the IQR for the presence (&gt;threshold)
and absence (&lt;threshold) of the particular emotion considered. After this, majority voting is
employed for the final classification of emotions in the tweets of the prepared corpus, i.e., when two or
more models agree on the classification (presence/absence) for a tweet, the tweet
receives that label. For example, if the scores of tweet t1 for the emotion sadness are
0 from Watson, 1 from Komprehend, and 0 from Text2emotion, as shown in Table 4,
then the final label is 0, i.e., the emotion is not present, as shown in Table 5. In the vast majority of
cases, a tweet is labeled 0 across the five emotions. The curated Twitter-based corpus is then
fully labeled using these classification rules.</p>
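          <p>The thresholding and majority-voting rules can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the quartile convention (the inclusive method of Python's <monospace>statistics.quantiles</monospace>) is an assumption because the text does not specify one, and the example scores are invented. The threshold values in <monospace>THRESHOLDS</monospace> are the ones reported in the text.</p>
          <preformat>
```python
from statistics import quantiles

# Thresholds reported in the text, one per emotion.
THRESHOLDS = {"sadness": 0.26, "joy": 0.20, "fear": 0.25, "disgust": 0.06, "anger": 0.12}


def iqr_threshold(scores, gold):
    """IQR (Q3 - Q1) of the scores whose ground-truth label is 1."""
    positives = [s for s, g in zip(scores, gold) if g == 1]
    q1, _, q3 = quantiles(positives, n=4, method="inclusive")
    return q3 - q1


def binarize(score, threshold):
    """1 (emotion present) if the score exceeds the threshold, otherwise 0."""
    return 1 if score > threshold else 0


def majority_label(votes):
    """1 when more than half of the available models vote 1, otherwise 0."""
    return 1 if 2 * sum(votes) > len(votes) else 0


def annotate(model_scores, thresholds=THRESHOLDS):
    """model_scores maps an emotion to the list of per-model scores for one tweet."""
    return {
        emotion: majority_label([binarize(s, thresholds[emotion]) for s in scores])
        for emotion, scores in model_scores.items()
    }


# Invented scores: three models for sadness, Watson only for disgust.
labels = annotate({"sadness": [0.10, 0.40, 0.05], "disgust": [0.30]})
```
          </preformat>
          <p>For sadness the binarized votes are 0, 1, and 0, so the final label is 0, which mirrors the example given in the text; for disgust the single Watson vote decides the label.</p>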
        </sec>
      </sec>
      <sec id="sec-4-4">
        <title>4.4. Evaluation Metrics</title>
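        <p>In multi-label tasks, the results can be partially correct or wrong. To capture the notion of
partial correctness, one can use metrics such as the Jaccard index, equivalent to multi-label accuracy,
which is denoted as <inline-formula><tex-math>J</tex-math></inline-formula>. It can be defined as the intersection size divided by the size of the
union of the ground-truth and predicted labels, and is formulated as:</p>
        <disp-formula id="eq-1">
          <label>(1)</label>
          <tex-math>J = \frac{1}{|T|} \sum_{t \in T} \frac{|G_t \cap P_t|}{|G_t \cup P_t|}</tex-math>
        </disp-formula>
        <p>where <inline-formula><tex-math>G_t</tex-math></inline-formula> is the set of the ground-truth labels for instance <inline-formula><tex-math>t</tex-math></inline-formula>, <inline-formula><tex-math>P_t</tex-math></inline-formula> is the set of the predicted
labels for instance <inline-formula><tex-math>t</tex-math></inline-formula>, and <inline-formula><tex-math>T</tex-math></inline-formula> is the set of tweets. Additionally, label-based metrics have also
been utilized for performance evaluation. Mathematically, micro-average is ascertained by
aggregating micro-average precision (<inline-formula><tex-math>P_{micro}</tex-math></inline-formula>) and micro-average recall (<inline-formula><tex-math>R_{micro}</tex-math></inline-formula>), which are defined
using equations 2 and 3, respectively.</p>
        <disp-formula id="eq-2">
          <label>(2)</label>
          <tex-math>P_{micro} = \frac{\sum_{l=1}^{N} TP(l)}{\sum_{l=1}^{N} \left( TP(l) + FP(l) \right)}</tex-math>
        </disp-formula>
        <disp-formula id="eq-3">
          <label>(3)</label>
          <tex-math>R_{micro} = \frac{\sum_{l=1}^{N} TP(l)}{\sum_{l=1}^{N} \left( TP(l) + FN(l) \right)}</tex-math>
        </disp-formula>
        <p>Similarly, macro-average is ascertained by aggregating macro-average precision (<inline-formula><tex-math>P_{macro}</tex-math></inline-formula>) and
macro-average recall (<inline-formula><tex-math>R_{macro}</tex-math></inline-formula>), which are defined using equations 4 and 5, respectively.</p>
        <disp-formula id="eq-4">
          <label>(4)</label>
          <tex-math>P_{macro} = \frac{1}{N} \times \sum_{l=1}^{N} \frac{TP(l)}{TP(l) + FP(l)}</tex-math>
        </disp-formula>
        <disp-formula id="eq-5">
          <label>(5)</label>
          <tex-math>R_{macro} = \frac{1}{N} \times \sum_{l=1}^{N} \frac{TP(l)}{TP(l) + FN(l)}</tex-math>
        </disp-formula>
        <p>where <inline-formula><tex-math>l</tex-math></inline-formula> is the predicted label, <inline-formula><tex-math>N</tex-math></inline-formula> is the number of labels, and <inline-formula><tex-math>TP(l)</tex-math></inline-formula>, <inline-formula><tex-math>FP(l)</tex-math></inline-formula>, and <inline-formula><tex-math>FN(l)</tex-math></inline-formula> are the numbers of true positives, false positives, and false negatives for label <inline-formula><tex-math>l</tex-math></inline-formula>.</p>
        <p>Formally, f1-score is defined as the harmonic mean of <inline-formula><tex-math>P</tex-math></inline-formula> and <inline-formula><tex-math>R</tex-math></inline-formula>, and its value ranges
from 0 to 1. Similarly, micro and macro f1-scores are computed using equation 6. The micro
f1-score gives equal weight to each testing instance, whereas macro f1-score gives equal weight
to each emotion. A higher value of f1-score indicates better multi-label classification results.</p>
        <disp-formula id="eq-6">
          <label>(6)</label>
          <tex-math>F1_{micro/macro} = 2 \times \frac{P_{micro/macro} \times R_{micro/macro}}{P_{micro/macro} + R_{micro/macro}}</tex-math>
        </disp-formula>
        <p>[Table: Sample binary annotations obtained using the threshold in a multi-model setup. Columns: IBM Watson NLU (Sadness, Joy, Fear, Disgust, Anger); Komprehend Text Analysis (Sadness, Joy, Fear, Anger); Text2emotion (Sadness, Joy, Fear, Anger). The cell values could not be recovered from the source.]</p>
        <p>[Table: Final emotion labels (columns include Sadness, Joy, and Fear). The remaining content could not be recovered from the source.]</p>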
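        <p>The evaluation metrics of this section can be computed with a short sketch such as the one below. It is an illustration, not the evaluation script used for the reported numbers: the function names are hypothetical, each tweet's labels are represented as a Python set, and treating an undefined ratio (zero denominator) as 0, or two empty label sets as a perfect match, are conventions assumed here.</p>
        <preformat>
```python
def jaccard_index(gold, pred):
    """Mean over tweets of |G ∩ P| / |G ∪ P| for the label sets G and P."""
    total = 0.0
    for g, p in zip(gold, pred):
        union = g.union(p)
        # Assumed convention: two empty label sets count as a perfect match.
        total += len(g.intersection(p)) / len(union) if union else 1.0
    return total / len(gold)


def _counts(gold, pred, label):
    """True positives, false positives and false negatives for one label."""
    tp = sum(1 for g, p in zip(gold, pred) if label in g and label in p)
    fp = sum(1 for g, p in zip(gold, pred) if label not in g and label in p)
    fn = sum(1 for g, p in zip(gold, pred) if label in g and label not in p)
    return tp, fp, fn


def _ratio(num, den):
    return num / den if den else 0.0


def _f1(precision, recall):
    return _ratio(2 * precision * recall, precision + recall)


def micro_macro_f1(gold, pred, labels):
    """Return (micro f1-score, macro f1-score) over the given labels."""
    counts = [_counts(gold, pred, label) for label in labels]
    # Micro: pool the counts over all labels, then take precision and recall once.
    tp = sum(c[0] for c in counts)
    fp = sum(c[1] for c in counts)
    fn = sum(c[2] for c in counts)
    micro = _f1(_ratio(tp, tp + fp), _ratio(tp, tp + fn))
    # Macro: take precision and recall per label, then average over the labels.
    macro_p = sum(_ratio(c[0], c[0] + c[1]) for c in counts) / len(labels)
    macro_r = sum(_ratio(c[0], c[0] + c[2]) for c in counts) / len(labels)
    return micro, _f1(macro_p, macro_r)


# Invented toy example with three tweets.
gold = [{"joy"}, {"fear", "anger"}, {"sadness"}]
pred = [{"joy"}, {"fear"}, {"joy"}]
j = jaccard_index(gold, pred)
micro_f1, macro_f1 = micro_macro_f1(gold, pred, ["joy", "sadness", "fear", "anger"])
```
        </preformat>
        <p>In the toy example the second tweet is only partially correct (one of two gold labels is predicted), which the Jaccard index credits with 0.5 instead of counting the tweet as entirely wrong.</p>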
      </sec>
      <sec id="sec-4-5">
        <title>4.5. Observations</title>
        <p>After calculating the thresholds using the benchmark dataset, each emotion’s micro-average and
macro-average recall, precision, and f1-score were evaluated to validate the thresholds. We obtained a Jaccard
index (equivalent to multi-label accuracy) of 0.52, and micro-average and macro-average f1-scores
of 63% and 61%, respectively.</p>
        <p>Table 6 shows the statistical details of our proposed dataset MINDS. The table gives the
number of tweets per sentiment and emotion class for each of the six hashtags. Among those, #Omicron
contains more instances than the other hashtags. Most of the hashtags have a large number
of negative tweets, except #OmicronInIndia and #Omicronvirus india, which have a large number of neutral
tweets. At the same time, the disgust emotion has a large number of instances, except for the #Omicronvirus india
hashtag, which has 52 instances of the emotion anger.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>In this paper, we have presented the curation of a large-scale, multi-labeled emotion and
sentiment classification dataset, MINDS, that contains COVID-19-related textual data. We used
a multi-model setup to classify the data into the most relevant emotion categories (annotation
labels), which are among the universal emotions of Ekman, and into sentiment categories. The
multi-model configuration comprised the IBM Watson NLU API, the Komprehend
API, and the Text2emotion Python package for automatic data annotation. This addresses the
critical issue of time-consuming and labor-intensive dataset labeling. A benchmark dataset
(SemEval-2018, subtask E-c) was used to determine the classification threshold. We obtained
a Jaccard index of 0.52 and micro-average and macro-average f1-scores of 63% and 61%,
respectively. The MINDS dataset is publicly available at http://www.abulaish.com/ldsa/dataset
for research purposes. We are currently expanding the dataset to include additional emotions,
such as love, amusement, desire, admiration, and grief.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>A.</given-names>
            <surname>Yadollahi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. G.</given-names>
            <surname>Shahraki</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O. R.</given-names>
            <surname>Zaiane</surname>
          </string-name>
          ,
          <article-title>Current state of text sentiment analysis from opinion to emotion mining</article-title>
          ,
          <source>ACM Computing Surveys (CSUR)</source>
          <volume>50</volume>
          (
          <year>2017</year>
          )
          <fpage>1</fpage>
          -
          <lpage>33</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>F. A.</given-names>
            <surname>Acheampong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Wenyu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Nunoo-Mensah</surname>
          </string-name>
          ,
          <article-title>Text-based emotion detection: Advances, challenges, and opportunities</article-title>
          ,
          <source>Engineering Reports</source>
          <volume>2</volume>
          (
          <year>2020</year>
          )
          <elocation-id>e12189</elocation-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Jung</surname>
          </string-name>
          ,
          <article-title>AttnConvnet at SemEval-2018 task 1: Attention-based convolutional neural networks for multi-label emotion classification</article-title>
          , arXiv preprint arXiv:1804.00831 (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>M.</given-names>
            <surname>Jabreel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Moreno</surname>
          </string-name>
          ,
          <article-title>A deep learning-based approach for multi-label emotion classification in tweets</article-title>
          ,
          <source>Applied Sciences</source>
          <volume>9</volume>
          (
          <year>2019</year>
          )
          <fpage>1123</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>P.</given-names>
            <surname>Ekman</surname>
          </string-name>
          ,
          <article-title>Basic emotions</article-title>
          ,
          <source>Handbook of cognition and emotion</source>
          <volume>98</volume>
          (
          <year>1999</year>
          )
          <fpage>16</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>W.</given-names>
            <surname>Ying</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Xiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Lu</surname>
          </string-name>
          ,
          <article-title>Improving multi-label emotion classification by integrating both general and domain knowledge</article-title>
          ,
          <source>in: Proceedings of the 5th Workshop on Noisy User-generated Text (W-NUT 2019)</source>
          , Association for Computational Linguistics
          , Hong Kong, China,
          <year>2019</year>
          , pp.
          <fpage>316</fpage>
          -
          <lpage>321</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>J.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Marujo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Karuturi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Brendel</surname>
          </string-name>
          ,
          <article-title>Improving multi-label emotion classification via sentiment classification with dual attention transfer network</article-title>
          ,
          <source>in: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing</source>
          , Association for Computational Linguistics, Brussels, Belgium,
          <year>2018</year>
          , pp.
          <fpage>1097</fpage>
          -
          <lpage>1102</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>S.</given-names>
            <surname>Mohammad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Bravo-Marquez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Salameh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Kiritchenko</surname>
          </string-name>
          ,
          <article-title>SemEval-2018 task 1: Affect in tweets</article-title>
          ,
          <source>in: Proceedings of the 12th International Workshop on Semantic Evaluation</source>
          , Association for Computational Linguistics
          , New Orleans, Louisiana,
          <year>2018</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>17</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>D.</given-names>
            <surname>Demszky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Movshovitz-Attias</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Ko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Cowen</surname>
          </string-name>
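          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Nemade</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ravi</surname>
          </string-name>
          ,
          <article-title>GoEmotions: A dataset of fine-grained emotions</article-title>
          ,
          <source>arXiv preprint arXiv:2005.00547</source>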
          (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>S.</given-names>
            <surname>Buechel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>U.</given-names>
            <surname>Hahn</surname>
          </string-name>
          ,
          <article-title>EmoBank: Studying the impact of annotation perspective and representation format on dimensional emotion analysis</article-title>
          ,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Mohammad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Bravo-Marquez</surname>
          </string-name>
          ,
          <article-title>WASSA-2017 shared task on emotion intensity</article-title>
          ,
          <source>arXiv preprint arXiv:1708.03700</source>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>