<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Overview of the FIRE 2023 Track: Artificial Intelligence on Social Media (AISoMe)</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Soham Poddar</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Moumita Basu</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Kripabandhu Ghosh</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Saptarshi Ghosh</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Amity University</institution>
          ,
          <addr-line>Kolkata</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Indian Institute of Science Education and Research</institution>
          ,
          <addr-line>Kolkata</addr-line>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Indian Institute of Technology</institution>
          ,
          <addr-line>Kharagpur</addr-line>
        </aff>
      </contrib-group>
      <abstract>
        <p>The COVID-19 pandemic showed the importance of vaccination at a large scale. However, people often expressed various concerns about vaccines that made them hesitant to take them. Some people were concerned about the potential side-effects of vaccines, while others believed that the vaccines were not necessary because the disease was mild. These concerns were frequently shared on social media sites such as Twitter. The FIRE 2023 AISoMe track focused on identifying the specific concern(s) that people have towards vaccines from tweets, as a 12-class multi-label classification task.</p>
      </abstract>
      <kwd-group>
        <kwd>Twitter</kwd>
        <kwd>microblogs</kwd>
        <kwd>COVID-19</kwd>
        <kwd>vaccine concerns</kwd>
        <kwd>tweet</kwd>
        <kwd>multi-label classification</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <table-wrap id="tab-1">
        <label>Table 1</label>
        <caption>
          <p>The 12 vaccine-concern classes used in the track, with their descriptions.</p>
        </caption>
        <table>
          <thead>
            <tr>
              <th>Class label</th>
              <th>Description</th>
            </tr>
          </thead>
          <tbody>
            <tr>
              <td>Conspiracy</td>
              <td>Deeper Conspiracy – The tweet suggests some deeper conspiracy, and not just that Big Pharma wants to make money (e.g., vaccines are being used to track people, COVID is a hoax).</td>
            </tr>
            <tr>
              <td>Country</td>
              <td>Country of origin – The tweet is against some vaccine because of the country where it was developed/manufactured.</td>
            </tr>
            <tr>
              <td>Ineffective</td>
              <td>Vaccine is ineffective – The tweet expresses concerns that the vaccines are not effective enough and are useless.</td>
            </tr>
            <tr>
              <td>Ingredients</td>
              <td>Vaccine Ingredients/technology – The tweet expresses concerns about the ingredients present in the vaccines (e.g., fetal cells, chemicals) or the technology used (e.g., mRNA vaccines can change your DNA).</td>
            </tr>
            <tr>
              <td>Mandatory</td>
              <td>Against mandatory vaccination – The tweet suggests that vaccines should not be made mandatory.</td>
            </tr>
            <tr>
              <td>Pharma</td>
              <td>Against Big Pharma – The tweet indicates that the Big Pharmaceutical companies are just trying to earn money, or is against such companies in general because of their history.</td>
            </tr>
            <tr>
              <td>Political</td>
              <td>Political side of vaccines – The tweet expresses concerns that the governments/politicians are pushing their own agenda through the vaccines.</td>
            </tr>
            <tr>
              <td>Religious</td>
              <td>The tweet opposes vaccines due to religious reasons.</td>
            </tr>
            <tr>
              <td>Rushed</td>
              <td>Untested/Rushed Process – The tweet expresses concerns that the vaccines have not been tested properly or that the published data is not accurate.</td>
            </tr>
            <tr>
              <td>Side-effect</td>
              <td>Side Effects/Deaths – The tweet expresses concerns about the side effects of the vaccines, including deaths.</td>
            </tr>
            <tr>
              <td>Unnecessary</td>
              <td>The tweet indicates vaccines are unnecessary, or that alternate cures are better.</td>
            </tr>
            <tr>
              <td>None</td>
              <td>No specific reason stated in the tweet, or some reason other than the given ones.</td>
            </tr>
          </tbody>
        </table>
      </table-wrap>
      <p>needs different persuasion and reasoning than someone who is hesitant to take vaccines due to
the corruption in politics.</p>
    </sec>
    <sec id="sec-2">
      <title>2. The Datasets and Evaluation Metrics</title>
      <p>This section describes the training and test datasets used for the track, and the
metrics used to evaluate the submitted runs/methods over the test dataset.</p>
      <sec id="sec-2-1">
        <title>2.1. The training / validation dataset</title>
        <p>
          For training and validation, we utilize the ‘CAVES’ dataset from our prior work [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]. This dataset
contains 9,921 anti-vaccine tweets about COVID vaccines (that were posted during 2020-21),
where each tweet has been labelled with one or more of the 12 classes (given in Table 1) by
human annotators. Table 2 shows some examples of tweets from this dataset, along with their
labels. More details about the data collection and annotation process of the CAVES dataset can
be found in the prior work [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ].
        </p>
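        <p>As a minimal sketch of how this multi-label annotation can be represented, the label set of each tweet can be mapped to a 12-dimensional binary vector. The class order and the helper names encode and decode below are assumptions made for illustration; they are not taken from the CAVES release.</p>
        <preformat>
```python
# Assumed label order; the short names follow the class labels used in this overview.
CLASSES = [
    "conspiracy", "country", "ineffective", "ingredients", "mandatory", "pharma",
    "political", "religious", "rushed", "side-effect", "unnecessary", "none",
]


def encode(labels):
    """Turn a collection of label names into a 12-dimensional 0/1 vector."""
    unknown = set(labels) - set(CLASSES)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    return [1 if name in labels else 0 for name in CLASSES]


def decode(vector):
    """Inverse of encode: recover the label names, in class order."""
    return [name for name, flag in zip(CLASSES, vector) if flag]


example = encode(["side-effect", "rushed"])
print(example)          # one entry per class, 1 where the label applies
print(decode(example))  # label names listed in the order of CLASSES
```
        </preformat>
        <p>A vector of this form is the usual target for a classifier with one output per class, which lets a single tweet carry several concerns at once.</p>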
        <table-wrap id="tab-2">
          <label>Table 2</label>
          <caption>
            <p>Examples of tweets from the CAVES dataset, along with their labels.</p>
          </caption>
          <table>
            <thead>
              <tr>
                <th>Tweet Excerpt</th>
                <th>Labels</th>
              </tr>
            </thead>
            <tbody>
              <tr>
                <td>STOP TAKING TOXIC VAX and expose COVID hoax and murders with morphine and ventillators. there is No covid!</td>
                <td>ingredients, conspiracy, unnecessary</td>
              </tr>
              <tr>
                <td>Please don’t push vaccine on us make it voluntary. We don’t trust anything to do with Bill Gates pushing their agenda of vaccine chips!!</td>
                <td>pharma, mandatory, ingredients</td>
              </tr>
              <tr>
                <td>The reason insurance companies won’t pay out if you experience the inevitable adverse reactions, including death is because it is an “Experimental Vaccine”</td>
                <td>side-effect, rushed</td>
              </tr>
              <tr>
                <td>Would you want the Russian vaccine? If not, you shouldn’t want one that’s been pushed through for political reasons either.</td>
                <td>political, country</td>
              </tr>
              <tr>
                <td>Catholic leaders are advising Catholics that the COVID-19 vaccine from Johnson &amp; Johnson is "morally compromised"</td>
                <td>religious</td>
              </tr>
              <tr>
                <td>I’m NOT taking your damn vaccine. Keep your conspiracy out of my veins!</td>
                <td>none</td>
              </tr>
            </tbody>
          </table>
        </table-wrap>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. The evaluation dataset</title>
        <p>For evaluation, we introduce a new dataset, developed in a similar fashion to the CAVES dataset.
This dataset contains 486 tweets labelled into the same 12 classes. However, these tweets are
not only about COVID vaccines but also about other types of vaccines (e.g., MMR vaccine, Flu
vaccine), from both the COVID era and pre-COVID times.</p>
      </sec>
      <sec id="sec-2-3">
        <title>2.3. Evaluation method</title>
        <p>The participating teams were asked to develop models for the multi-label classification task.
The models were trained on the CAVES dataset, and their performance was measured on the
evaluation dataset described above. Each participating team could submit up to 3 runs,
e.g., from models with different hyperparameters. Teams were also free to use other attributes of
the tweets (apart from the text), along with other publicly available datasets, for
training their models.</p>
        <p>The runs submitted by the participants were ranked by their performance on the
evaluation dataset. The standard classification metric of Macro-F1 over the 12
classes, i.e., the unweighted mean of the per-class F1 scores, was used for evaluation.</p>
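        <p>To make the metric concrete, the following sketch computes Macro-F1 for multi-label predictions on a toy example with three classes. The function names and the convention of scoring a class with no gold and no predicted instances as 0 are assumptions of this sketch, not a description of the evaluation script used in the track.</p>
        <preformat>
```python
def f1_for_class(gold, pred, name):
    """F1 of one class; gold and pred are lists of label sets, one per tweet."""
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if name in g and name in p)
    fp = sum(1 for g, p in pairs if name not in g and name in p)
    fn = sum(1 for g, p in pairs if name in g and name not in p)
    denom = 2 * tp + fp + fn
    # Assumption: a class with no gold and no predicted instances scores 0.
    return 2 * tp / denom if denom else 0.0


def macro_f1(gold, pred, classes):
    """Unweighted mean of the per-class F1 scores."""
    return sum(f1_for_class(gold, pred, c) for c in classes) / len(classes)


# Toy data: three tweets, three of the twelve classes.
classes = ["side-effect", "rushed", "none"]
gold = [{"side-effect", "rushed"}, {"none"}, {"rushed"}]
pred = [{"side-effect"}, {"none"}, {"side-effect", "rushed"}]
print(round(macro_f1(gold, pred, classes), 3))
```
        </preformat>
        <p>On this toy input the per-class F1 scores are 2/3, 2/3 and 1, so the sketch prints 0.778. Because every class carries equal weight, rare classes influence the score as much as frequent ones.</p>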
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Methods - Submitted runs</title>
      <p>
        In the AISoMe track, 22 teams participated this year, and 48 runs were submitted in total.
Most of the teams used NLP pre-processing techniques, and a few teams used a TF-IDF vectorizer
to extract features. Among the classification techniques, fine-tuned transformer models such as
BERT, RoBERTa and Covid-Twitter-BERT (CT-BERT) [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] were used most often by the participating
teams. Some teams also employed LLM-based models such as GPT 3.5 and GPT2LMHeadModel.
Neural network-based classifiers (MLP) and traditional classifiers (such as Multinomial Naïve
Bayes, Support Vector Machines and Multi-Output Classifiers) were also used by some of the
teams. The summary of the techniques is reported in Table 3. It is observed that fine-tuned
CT-BERT models outperformed all traditional and other neural classifiers for our task.
      </p>
      <table-wrap id="tab-3">
        <label>Table 3</label>
        <caption>
          <p>Summary of the methods used by the participating teams.</p>
        </caption>
        <table>
          <thead>
            <tr>
              <th>Team Id</th>
              <th>Overview of method</th>
            </tr>
          </thead>
          <tbody>
            <tr><td>AKCSIT</td><td>Fine-tuned CT-BERT</td></tr>
            <tr><td>DatawIz</td><td>Fine-tuned CT-BERT</td></tr>
            <tr><td>IISERBPR-NLP</td><td>Fine-tuned BERT with best threshold</td></tr>
            <tr><td>DSIRC</td><td>Fine-tuned CT-BERT</td></tr>
            <tr><td>Cognitive Coders</td><td>Fine-tuned DeBERTa Large</td></tr>
            <tr><td>TextTitans</td><td>BERT-large uncased</td></tr>
            <tr><td>PICT CL LAB Group 1</td><td>RoBERTa-based model</td></tr>
            <tr><td>SSN_IT_Team01</td><td>RoBERTa-based model</td></tr>
            <tr><td>LLM-geeks</td><td>Intersection of the predictions from DeBERTa and RoBERTa</td></tr>
            <tr><td>SSN_IT_Team02</td><td>RoBERTa-based model</td></tr>
            <tr><td>Data Warriors</td><td>LLM-based model (GPT 3.5)</td></tr>
            <tr><td>Alpha Intellect AI</td><td>BERT based uncased</td></tr>
            <tr><td>S3 Endeavour</td><td>GPT2LMHeadModel</td></tr>
            <tr><td>PICT CL Lab</td><td>Support Vector Machine (SVM) model</td></tr>
            <tr><td>ZSL</td><td>Decision Tree Classifier + Multi-Output Classifier</td></tr>
            <tr><td>C3</td><td>RoBERTa-based sentence classification</td></tr>
            <tr><td>APS AI&amp;ML</td><td>Multinomial Naive Bayes, Multi-Output Classifier</td></tr>
            <tr><td>Social Media Data Analysis Team</td><td>Classifier chain with Support Vector Machines</td></tr>
            <tr><td>OpenVax</td><td>Multi-Layer Perceptron (MLP) model</td></tr>
            <tr><td>RANJAN A-MONKA-RESEARCH</td><td>CNN-BiLSTM model with GloVe embeddings</td></tr>
            <tr><td>Swastik Anupam</td><td>TF-IDF neural network</td></tr>
            <tr><td>IIIT_SURAT</td><td>SVM models within the Classifier Chain</td></tr>
          </tbody>
        </table>
        <table-wrap-foot>
          <p>Scores that appear with this table but cannot be matched to individual rows: 0.57, 0.55, 0.54, 0.46, 0.45, 0.43, 0.41, 0.39, 0.38, 0.37, 0.29, 0.25, 0.07.</p>
        </table-wrap-foot>
      </table-wrap>
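      <p>Two of the ideas listed among the submitted methods, a decision threshold on per-class scores and the intersection of the predictions of two models, can be sketched as follows. This is an illustrative sketch only: the function names, the default threshold of 0.5 and the fallback to the none label are assumptions, and the actual implementations of the teams may differ.</p>
      <preformat>
```python
def scores_to_labels(scores, threshold=0.5, fallback="none"):
    """Keep every class whose score reaches the threshold.

    scores maps a class name to a value in [0, 1], e.g. a sigmoid output.
    Returning the fallback label for an empty result is an assumption.
    """
    chosen = sorted(name for name, s in scores.items() if s >= threshold)
    return chosen if chosen else [fallback]


def intersect_predictions(first, second, fallback="none"):
    """Keep only the labels that both models predicted for a tweet."""
    common = sorted(set(first).intersection(second))
    return common if common else [fallback]


# Hypothetical scores for one tweet (other classes omitted for brevity).
scores = {"rushed": 0.81, "side-effect": 0.64, "pharma": 0.12}
print(scores_to_labels(scores))
print(scores_to_labels(scores, threshold=0.9))
print(intersect_predictions(["rushed", "side-effect"], ["pharma", "side-effect"]))
```
      </preformat>
      <p>Raising the threshold or intersecting predictions makes a system more conservative: fewer labels are emitted per tweet, which tends to trade recall for precision.</p>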
    </sec>
    <sec id="sec-4">
      <title>4. Conclusion and Future Directions</title>
      <p>
        The FIRE 2023 AISoMe track compared the performance of various methods for identifying the
specific anti-vaccine concerns expressed in tweets. We hope that the test collections developed in this
track will be used by the research community to develop better models for this
important task in the future. The CAVES dataset also contains explanations
for the class labels, as well as summaries for the different anti-vaccine classes (details in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]).
These data can also be used for tasks such as explainable tweet classification and tweet
summarization in the future.
      </p>
      <p>The track organizers thank all the participants for their interest in this track, and the FIRE
authorities for their support in running the track.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>S.</given-names>
            <surname>Poddar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mondal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Misra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ganguly</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ghosh</surname>
          </string-name>
          ,
          <article-title>Winds of change: Impact of covid-19 on vaccine-related opinions of twitter users</article-title>
          ,
          <source>in: Proceedings of the International AAAI Conference on Web and Social Media</source>
          , volume
          <volume>16</volume>
          , AAAI Press,
          <year>2022</year>
          , pp.
          <fpage>782</fpage>
          -
          <lpage>793</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>L.-A.</given-names>
            <surname>Cotfas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Delcea</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Roxin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Ioanăş</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. S.</given-names>
            <surname>Gherai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Tajariol</surname>
          </string-name>
          ,
          <article-title>The longest month: analyzing covid-19 vaccination opinions dynamics from tweets in the month following the first vaccine announcement</article-title>
          ,
          <source>IEEE Access</source>
          <volume>9</volume>
          (
          <year>2021</year>
          )
          <fpage>33203</fpage>
          -
          <lpage>33223</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>S.</given-names>
            <surname>Poddar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. M.</given-names>
            <surname>Samad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Mukherjee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ganguly</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ghosh</surname>
          </string-name>
          ,
          <article-title>CAVES: A dataset to facilitate Explainable Classification and Summarization of Concerns towards COVID Vaccines</article-title>
          ,
          <source>in: Proceedings of the International ACM SIGIR Conference</source>
          ,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>M.</given-names>
            <surname>Müller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Salathé</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. E.</given-names>
            <surname>Kummervold</surname>
          </string-name>
          ,
          <article-title>Covid-twitter-bert: A natural language processing model to analyse covid-19 content on twitter</article-title>
          , arXiv preprint arXiv:2005.07503 (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>