<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>CryptoOpinionMining: A Comparative Analysis of Hierarchical and Flat Classification Models for Social Media Content</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Rudra Roy</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Pritam Pal</string-name>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Rishika Jha</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Dipankar Das</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Indian Institute of Engineering Science and Technology</institution>
          ,
          <addr-line>Shibpur, Howrah, West Bengal, 711103</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Institute of Engineering and Management</institution>
          ,
          <addr-line>Kolkata, West Bengal, 700091</addr-line>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Jadavpur University</institution>
          ,
          <addr-line>Kolkata, West Bengal, 700032</addr-line>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>RCC Institute of Information Technology</institution>
          ,
          <addr-line>Kolkata, West Bengal, 700015</addr-line>
        </aff>
      </contrib-group>
      <abstract>
<p>With the recent popularity of cryptocurrency, social media platforms such as Twitter and Reddit have become the epicentre for cryptocurrency enthusiasts to share opinions and discussions. This paper presents a three-level hierarchical and a single-level flat classification framework that can classify cryptocurrency-related unstructured social media opinions by leveraging the power of pre-trained transformer-based models. Along with opinion classification, we also propose a question-answering framework utilizing a BiLSTM model that can identify the most relevant social media comment for a cryptocurrency-related question. By training the models on the training data provided by the FIRE 2024 CryptoQA shared task organizers and evaluating them on the test data, our best-performing opinion classification framework achieved macro F1 scores of 0.778 and 0.542 for the Twitter and Reddit test data respectively, which is also the best-performing framework in 'FIRE 2024 CryptoQA Task 1: Opinion Classification from CryptoCurrency related Social Media Posts'.</p>
      </abstract>
      <kwd-group>
<kwd>Opinion Classification</kwd>
        <kwd>Hierarchical Classification</kwd>
        <kwd>Question Answering</kwd>
        <kwd>Social Media Analysis</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
<p>In the past few years, social networks have become one of the most important sources of information in society and are widely used to share opinions and news across various fields. One field that has recently been actively discussed on social networks is cryptocurrency. Digital currencies such as Bitcoin, Ethereum, and several other altcoins have sparked significant discussions, controversies, and expectations on social media platforms. Cryptocurrencies are also highly volatile, and their value can change in the blink of an eye in response to public opinion. This makes them an interesting but difficult subject for natural language processing (NLP) and sentiment analysis. Discussions related to cryptocurrencies on social media contain technical terms, self-referencing factors, and dynamic narratives, unlike traditional financial markets where sentiment analysis is well established.</p>
<p>The specific problem we address in this research is the accurate understanding of opinions expressed in cryptocurrency-related social media posts. This task is particularly challenging due to the diverse nature of these posts, which range from objective price discussions to subjective predictions, and from technical analyses to promotional content. We observed that traditional sentiment analysis techniques do not suffice when it comes to analyzing complex cryptocurrency language.</p>
<p>Therefore, the main objective of the proposed study is to develop an accurate and efficient opinion classification model for social media posts regarding cryptocurrencies. We aim to create a system capable of categorizing posts into noise, fact, and opinion, with fine-grained sentiment within the fact and opinion types, along with a system that can give efficient answers to cryptocurrency-related questions.</p>
      <p>
        Along with the classification task, we focused on the different questions and doubts that arise among potential crypto investors, and on checking whether a comment is relevant to a given query. The main contributions of this paper can be summarized as follows:
• We developed a novel three-level hierarchical classification framework for cryptocurrency-related social media posts utilizing a series of fine-tuned RoBERTa-base[
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] models to progressively classify posts with increasing specificity.
• Next, we developed a single-level classification framework by employing one fine-tuned XLM-RoBERTa-base[2] model.
• Furthermore, we also developed a cryptocurrency question-answering framework using BiLSTM[3].
      </p>
      <p>In the following sections, we divide the whole task into two parts: one is ‘Opinion classification on cryptocurrency-related social media posts’ (Task 1) and the other is ‘Question answering from cryptocurrency-related social media posts’ (Task 2).</p>
    </sec>
    <sec id="sec-2">
      <title>2. Dataset</title>
      <p>2.1. Task 1
This task utilizes two different datasets, consisting of cryptocurrency-related Twitter and Reddit posts, both provided by the FIRE (Forum for Information Retrieval Evaluation) 2024 CryptoQA[4] shared-task organizers. The datasets were created to be highly diverse in terms of opinions and types of content regarding cryptocurrencies and thus provide a sound basis for this multi-level classification task. There are 5000 opinions in the Reddit dataset and 4987 opinions in the Twitter dataset. The distribution of opinions in both datasets is given in Table 1.</p>
      <sec id="sec-2-1">
        <title>2.1.1. Hierarchical Labeling Structure</title>
        <p>Each of the entries in the two datasets was labeled based on a three-tiered hierarchical classification
scheme, designed to capture the complexity of opinions regarding cryptocurrencies. The labelling
structure is described in Figure 1.</p>
        <p>[Figure 1: The three-level labelling structure. Level 1 classifies tweets/Reddit posts into Noise, Objective, and Subjective; Level 2 classifies Subjective posts into Neutral, Negative, and Positive; Level 3 classifies Neutral posts into Neutral Sentiments, Questions, Advertisements, and Miscellaneous.]</p>
        <p>[Table 1: Distribution of opinions across the Level 1/2/3 categories (Noise, Objective, Subjective; Neutral, Negative, Positive; Neutral Sentiments, Questions, Advertisements, Miscellaneous) in the Twitter and Reddit datasets.]</p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Task 2</title>
        <p>The dataset for Task 2 was also provided by the FIRE 2024 CryptoQA shared task organizers and contains a total of 7932 questions with corresponding comments. The dataset annotates the relevancy of a comment to a specific question with 0 or 1, where 0 represents non-relevant and 1 represents relevant. The training dataset contains five fields: title, MAIN, selftext, comment_body, and relevance. It has a total of 6787 non-relevant comments and 1145 relevant comments. The distribution of data is provided in Figure 2.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Methodology</title>
      <p>3.1. Task 1
This section presents a comprehensive description of our approach to opinion classification of social media posts and question-answering related to cryptocurrencies, including text preprocessing, framework development, and training.</p>
      <p>For Task 1, we present a multi-level classifier and a single-level classifier capable of categorizing these nuanced opinions about cryptocurrencies. We propose a methodology to tackle fine-grained opinion classification, featuring novel methods that cater to the peculiarities of cryptocurrency discussions on social media platforms.</p>
      <sec id="sec-3-1">
        <title>3.1.1. Text Preprocessing</title>
        <p>In our approach to classifying cryptocurrency-related social media posts, we purposefully chose not to
use conventional text processing methods. This decision was based on our experimental observations
and the unique characteristics of the dataset. Our experiments revealed that any form of preprocessing,
including common techniques such as lowercasing, punctuation removal, or stop word elimination,
negatively impacted the model’s ability to identify the ’noise’ class accurately. Consequently, we opted
to use the raw text as input for our models. This approach preserves all original features of the posts,
including capitalization, punctuation, and special characters.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.1.2. Tokenization</title>
        <p>Our research employed two classification frameworks: a hierarchical classification model and a single-level classification model. For both approaches, we utilized pre-trained models and their corresponding tokenizers from the Hugging Face Transformers library.</p>
        <p>For the three-level hierarchical model, we used the RoBERTa-base model and its tokenizer, and for the single-level approach, we employed the XLM-RoBERTa (XLM-R) model and its corresponding tokenizer.</p>
        <p>For both frameworks, the tokenized sequences and their attention masks were used as input for the respective classification models. This dual approach allowed us to compare the effectiveness of hierarchical versus single-level classification for cryptocurrency-related social media posts.</p>
      </sec>
      <sec id="sec-3-3">
        <title>3.1.3. Proposed Framework</title>
        <p>We developed two frameworks to accomplish the classification task, leveraging the power of the transformer-based RoBERTa-base and XLM-RoBERTa-base models.</p>
        <p>The RoBERTa-base model is a transformer-based architecture with 12 encoder layers, 768 hidden
units, and 12 attention heads, totalling 125 million parameters. It uses a vocabulary of 50,265 tokens and
supports a maximum sequence length of 512 tokens. Unlike BERT, RoBERTa was trained on a larger
and more diverse dataset (160GB of text from Common Crawl), with dynamic masking and without the
Next Sentence Prediction (NSP) objective. It was optimized with large batch sizes and a higher learning
rate, resulting in improved performance across various NLP tasks such as text classification, named
entity recognition, and question answering.</p>
        <p>On the other hand, XLM-RoBERTa is an improved version of XLM that builds upon the RoBERTa model and is trained on 100 languages. The overall model framework is depicted in Figure 3.</p>
        <p>3-level classification framework: The flow diagram for the 3-level hierarchical classification framework is provided in Figure 3(a), where we employed separate ‘RoBERTa-base’ models for each level of classification, leveraging their pre-trained knowledge for effective transfer learning in the domain of cryptocurrency-related social media content.</p>
        <p>Single-level classification framework: In this scheme, we first converted the hierarchical annotation to a single-level annotation: Noise, Objective, Negative, Positive, Neutral Sentiments, Questions, Advertisements, and Miscellaneous. Then, we employed one ‘XLM-RoBERTa-base’ model to classify the posts into one of these 8 labels. Figure 3(b) provides the overall flow diagram for the single-level classification framework.</p>
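        <p>To make the label conversion concrete, the collapse from the three-level hierarchy to the eight flat labels can be sketched as follows (a minimal sketch; the field names level1/level2/level3 are hypothetical, not the shared task’s actual column names):</p>

```python
# Sketch: collapse the three-level hierarchical annotation into one flat label.
# The keys "level1"/"level2"/"level3" are hypothetical field names; the shared
# task data may name these columns differently.

FLAT_LABELS = ["Noise", "Objective", "Negative", "Positive",
               "Neutral Sentiments", "Questions", "Advertisements", "Miscellaneous"]

def flatten(post):
    """Return the single-level label for a hierarchically annotated post."""
    if post["level1"] in ("Noise", "Objective"):
        return post["level1"]              # leaves of level 1
    if post["level2"] in ("Negative", "Positive"):
        return post["level2"]              # sentiment leaves of level 2
    return post["level3"]                  # fine-grained neutral classes

print(flatten({"level1": "Subjective", "level2": "Negative", "level3": None}))   # Negative
print(flatten({"level1": "Subjective", "level2": "Neutral",
               "level3": "Questions"}))                                          # Questions
```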
      </sec>
      <sec id="sec-3-4">
        <title>3.1.4. Training</title>
        <p>For 3-level classification, the training process involves the training of three independent models, each trained on a specific subset of the data:
1. Model 1 Training:
• Input: The entire training dataset.
• Output: Classification into Noise, Objective, and Subjective categories.
2. Model 2 Training:
• Input: Posts labelled as "Subjective" in the training dataset.
• Output: Classification into Neutral, Negative, and Positive sentiments.
3. Model 3 Training:
• Input: Posts labelled as "Neutral" in the training dataset.
• Output: Fine-grained classification into Neutral Sentiments, Questions, Advertisements, and Miscellaneous categories.</p>
        <p>[Figure 3: (a) The three-level hierarchical classification framework: preprocessing and tokenization followed by cascaded RoBERTa-base models producing Noise/Objective/Subjective, then Neutral/Negative/Positive, then the fine-grained neutral classes. (b) The single-level classification framework: preprocessing and tokenization, XLM-RoBERTa-base, dropout of 0.2, and a dense layer (128 hidden units, ReLU activation) producing the eight flat labels.]</p>
        <p>In the case of single-level classification, we provided the entire training dataset as input and classified the output into the Noise, Objective, Negative, Positive, Neutral Sentiments, Questions, Advertisements, and Miscellaneous categories. For both the 3-level and single-level frameworks, the output layers used softmax as their activation function.</p>
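        <p>For reference, the softmax activation at the output layers normalizes the raw logits into a probability distribution over the classes; a minimal sketch:</p>

```python
# The softmax activation used at every output layer: exponentiate the logits
# and normalize so that the class scores sum to 1.
import math

def softmax(logits):
    m = max(logits)                          # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
print([round(p, 3) for p in probs])          # three class probabilities summing to 1
```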
        <p>As no validation split was provided in the ‘CryptoQA’ shared task data, we randomly split the train
data into a 4:1 ratio, where 80% of data was taken as training split and 20% of data was considered as
validation split to examine the performances at the time of training.</p>
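        <p>The 4:1 split described above can be sketched in plain Python (the fixed random seed is an assumption added here for reproducibility; it is not stated in the paper):</p>

```python
# Sketch of the 4:1 train/validation split: shuffle, then cut at 80%.
import random

def split_4_to_1(samples, seed=42):
    samples = list(samples)
    random.Random(seed).shuffle(samples)     # seeded shuffle for reproducibility
    cut = int(0.8 * len(samples))            # 80% training, 20% validation
    return samples[:cut], samples[cut:]

train, val = split_4_to_1(range(5000))
print(len(train), len(val))                  # 4000 1000
```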
      </sec>
      <sec id="sec-3-5">
        <title>3.1.5. Tuning Parameters</title>
        <p>During the training process, we fine-tuned the parameters of the RoBERTa and XLM-RoBERTa models to achieve the best possible results on the validation dataset. For the 3-level hierarchical framework, the models were trained for up to 3 epochs with a batch size of 16. The learning rate was set to 2e-5 with the AdamW[5] optimizer and a weight decay of 0.01.</p>
        <p>In the case of the single-level classification framework, the proposed framework was trained for up to 5 epochs with a batch size of 16, and the learning rate was set to 2e-5 with the Adam[6] optimizer.
3.2. Task 2
For Task 2, we developed a framework to accomplish the question-answering task using the BiLSTM model. As mentioned in Section 2.2, the comments for questions were annotated as relevant or non-relevant in the original dataset, so we treat this task as a relevancy classification task where we aim to classify a comment as relevant or non-relevant.</p>
        <p>A BiLSTM has long-term memory, which allows it to memorize important information from long sequences of text. Its bidirectional capability also enables the model to capture both past and future contexts within a sequence, giving it a better understanding of language than unidirectional LSTMs.</p>
        <p>[Figure 4: Flow diagram of the question-answering framework: the question and the comment are each tokenized, passed through a word embedding layer and a BiLSTM layer (64 hidden units), and the two BiLSTM outputs are concatenated.]</p>
        <p>The overall model flow diagram is depicted in Figure 4, where we passed the word embedding representations of the tokenized question and the corresponding comment into two separate BiLSTM layers with 64 hidden units each. Next, the outputs of the two BiLSTM layers were concatenated and passed to the final output layer to identify whether the comment was relevant to the question or not. The output layer used two units with softmax as the activation function.</p>
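        <p>A minimal Keras sketch of this dual-encoder architecture is given below; the vocabulary size, embedding dimension, and maximum sequence length are assumptions, as the paper does not state them:</p>

```python
# Hypothetical Keras sketch of the dual-BiLSTM relevancy classifier.
# VOCAB_SIZE, EMB_DIM and MAX_LEN are assumptions, not values from the paper.
import tensorflow as tf
from tensorflow.keras import layers, Model

VOCAB_SIZE, EMB_DIM, MAX_LEN = 20000, 100, 128

def build_relevancy_model():
    q_in = layers.Input(shape=(MAX_LEN,), name="question_tokens")
    c_in = layers.Input(shape=(MAX_LEN,), name="comment_tokens")
    # Separate word-embedding layers and BiLSTM encoders (64 hidden units each)
    q_vec = layers.Bidirectional(layers.LSTM(64))(layers.Embedding(VOCAB_SIZE, EMB_DIM)(q_in))
    c_vec = layers.Bidirectional(layers.LSTM(64))(layers.Embedding(VOCAB_SIZE, EMB_DIM)(c_in))
    merged = layers.Concatenate()([q_vec, c_vec])          # concatenate the two encodings
    out = layers.Dense(2, activation="softmax")(merged)    # relevant vs. non-relevant
    model = Model(inputs=[q_in, c_in], outputs=out)
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
                  loss="categorical_crossentropy")
    return model

model = build_relevancy_model()
```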
        <p>To train the proposed framework, we split the training data in the same way as discussed in Section 3.1.4. We selected categorical cross-entropy as the loss function with a learning rate of 0.001. The optimizer was Adam, and the model was trained for up to 5 epochs with a batch size of 128.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Evaluation and Result</title>
      <p>This section presents the evaluation and results of our opinion classification task (Task 1) and question-answering task (Task 2) for cryptocurrency-related social media posts. We conducted extensive testing on a validation dataset to assess the frameworks’ effectiveness. We provide a detailed account of the prediction process and discuss the outcomes observed on the test datasets provided by the FIRE 2024 CryptoQA task organizers.
4.1. Prediction
Three-level hierarchical framework: The prediction process followed a hierarchical structure, mirroring our three-level training architecture:
1. Level 1 Classification: All posts in the test dataset were initially processed by our first-level model, which categorized them into three classes: Noise, Objective, and Subjective.
2. Level 2 Classification: Posts classified as "Subjective" by the first model were then fed into the second-level model. This model further categorized these posts into Neutral, Negative, or Positive sentiments.
3. Level 3 Classification: Posts classified as "Neutral" by the second model were passed to the third-level model, which categorized them into Neutral Sentiments, Questions, Advertisements, or Miscellaneous.</p>
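      <p>The cascaded prediction logic can be sketched with stub classifiers standing in for the three fine-tuned RoBERTa-base models (the stubs below are toy placeholders for illustration, not the actual models):</p>

```python
# Sketch of the three-level prediction cascade. The classify_level* arguments
# stand in for the fine-tuned RoBERTa-base models of levels 1-3.

def predict(post, classify_level1, classify_level2, classify_level3):
    label = classify_level1(post)
    if label != "Subjective":
        return label                      # Noise / Objective stop at level 1
    label = classify_level2(post)
    if label != "Neutral":
        return label                      # Negative / Positive stop at level 2
    return classify_level3(post)          # fine-grained neutral classes

# Toy stubs illustrating the routing:
l1 = lambda p: "Subjective" if "think" in p else "Noise"
l2 = lambda p: "Positive" if "moon" in p else "Neutral"
l3 = lambda p: "Questions" if "?" in p else "Miscellaneous"

print(predict("I think BTC will moon", l1, l2, l3))        # Positive
print(predict("I think, will ETH recover?", l1, l2, l3))   # Questions
```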
      <p>Single-level classification framework: In the case of single-level classification, the prediction strategy
was relatively simple where all posts in the test dataset were passed as input to the framework and
categorized into eight classes: Noise, Objective, Negative, Positive, Neutral Sentiments, Questions,
Advertisements, and Miscellaneous. The results for these two approaches are provided in Table 2.</p>
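      <p>The macro-F1 metric used for evaluation averages the per-class F1 scores so that every class carries equal weight regardless of its frequency; a minimal sketch:</p>

```python
# Macro-F1: compute the F1 score for each class separately, then average.

def macro_f1(y_true, y_pred):
    classes = sorted(set(y_true) | set(y_pred))
    f1s = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        f1s.append(2 * tp / (2 * tp + fp + fn) if tp else 0.0)
    return sum(f1s) / len(f1s)

print(round(macro_f1(["Noise", "Noise", "Objective", "Subjective"],
                     ["Noise", "Objective", "Objective", "Subjective"]), 3))  # 0.778
```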
      <table-wrap id="tab2">
        <label>Table 2</label>
        <caption>
          <p>Macro-F1 scores of the two Task 1 frameworks on the Twitter and Reddit test data, as evaluated by the shared task organizers.</p>
        </caption>
        <table>
          <thead>
            <tr><th>Run file name</th><th>Summary</th><th>Twitter data (macro-F1)</th><th>Reddit data (macro-F1)</th></tr>
          </thead>
          <tbody>
            <tr><td>run-1.csv</td><td>Hierarchical framework</td><td>0.725</td><td>0.518</td></tr>
            <tr><td>run-2.csv</td><td>Single-level framework</td><td>0.778</td><td>0.542</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>4.2. Result
From Table 2, it is observed that the single-level classification framework achieved the best results, with macro-F1 scores of 0.778 and 0.542 for the Twitter and Reddit test data respectively. The performance of the hierarchical classification framework was 6.94% and 2.4% lower for the Twitter and Reddit data respectively, compared to the single-level framework. In addition, our proposed single-level framework achieved a better score than the other shared task participants in ‘FIRE 2024 CryptoQA Task 1: Opinion Classification from CryptoCurrency related Social Media Posts’.</p>
      <p>Regarding Task 2, our proposed framework did not perform well, achieving a macro-F1 of 0.14651.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion and Future Work</title>
      <p>In this paper, we presented a novel hierarchical approach and a single-level approach to classifying cryptocurrency-related social media posts. Both approaches performed well, and among them, the single-level approach provided the best result in classifying cryptocurrency-related social media posts.</p>
      <p>While our models show promising results, there are areas for further improvement and research. Future work could focus firstly on incorporating more advanced natural language processing techniques to better capture context and nuance, and secondly on addressing potential error propagation through the hierarchical levels. These advancements may improve our categorization accuracy even further.</p>
    </sec>
    <sec id="sec-6">
      <title>Declaration on Generative AI</title>
      <p>The authors did not use any Generative AI tool.</p>
    </sec>
    <sec id="sec-7">
      <title>References</title>
      <p>[2] A. Conneau, K. Khandelwal, N. Goyal, V. Chaudhary, G. Wenzek, F. Guzmán, E. Grave, M. Ott, L. Zettlemoyer, V. Stoyanov, Unsupervised cross-lingual representation learning at scale, CoRR abs/1911.02116 (2019). URL: http://arxiv.org/abs/1911.02116. arXiv:1911.02116.</p>
      <p>[3] S. Hochreiter, J. Schmidhuber, Long short-term memory, Neural Computation 9 (1997) 1735–1780. URL: https://doi.org/10.1162/neco.1997.9.8.1735. doi:10.1162/neco.1997.9.8.1735.</p>
      <p>[4] K. R. K. G. Sougata Sarkar, Gourav Sen, Understanding cryptocurrency related opinions and questions from social media posts (CryptoQA 2024), in: Forum for Information Retrieval Evaluation (FIRE), 2024. URL: https://sites.google.com/view/cryptoqa-2024.</p>
      <p>[5] I. Loshchilov, F. Hutter, Decoupled weight decay regularization, 2019. URL: https://arxiv.org/abs/1711.05101. arXiv:1711.05101.</p>
      <p>[6] D. P. Kingma, J. Ba, Adam: A method for stochastic optimization, 2017. URL: https://arxiv.org/abs/1412.6980. arXiv:1412.6980.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ott</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Goyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Du</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Joshi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Levy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lewis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Zettlemoyer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Stoyanov</surname>
          </string-name>
          ,
          <article-title>RoBERTa: A robustly optimized BERT pretraining approach</article-title>
          , CoRR abs/1907.11692 (
          <year>2019</year>
          ). URL: http://arxiv.org/abs/1907.11692. arXiv:1907.11692.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>