<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>ChatGPT to Detect Subjective Statements and Political Bias</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Mehmet Deniz Türkmen</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Gökalp Coşgun</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mucahid Kutlu</string-name>
          <email>mkutlu@etu.edu.tr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>TOBB University of Economics and Technology</institution>
          ,
          <addr-line>Ankara</addr-line>
          ,
          <country country="TR">Turkey</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2023</year>
      </pub-date>
      <abstract>
        <p>Information has been referred to as the “oil” of the 21st century, emphasizing its immense importance. However, incorrect information can pose significant risks and hazards. Hence, it is imperative to reduce the spread of misinformation. In this paper, we present our participation in Task 2 (i.e., detecting subjective statements) and Task 3A (i.e., detecting political bias in news articles) of CLEF CheckThat! 2023, which focuses on reducing the spread of misinformation. We propose utilizing ChatGPT for these classification tasks and explore zero-shot and few-shot classification using ChatGPT. While the performance of our approach varies across different languages in Task 2, we are ranked 3rd on the German dataset with a macro F<sub>1</sub> score of 0.71. In Task 3A, we are ranked 2nd with a Mean Absolute Error (MAE) of 0.646.</p>
      </abstract>
      <kwd-group>
        <kwd>fact-checking</kwd>
        <kwd>subjectivity</kwd>
        <kwd>political bias</kwd>
        <kwd>shared task</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>The Internet has served as the primary tool for accessing information over the past two
decades. Alongside its ability to facilitate the rapid dissemination of information, it
provides a wide range of information sources thanks to its inclusive nature, allowing
contributions from anyone. While this accessibility makes reaching information extremely
convenient, it also raises concerns about the reliability and quality of the information
available.</p>
      <p>
        As a result, ensuring the trustworthiness of information has become an
important research direction [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ].
      </p>
      <p>The process of verifying factuality is not a straightforward task and often requires
multiple steps to reach a conclusive result. One crucial step in fact-checking is to decide
whether a statement requires fact-checking or not. For instance, we first need to detect
the subjectivity of a statement as there is no need to fact-check personal opinions and
beliefs. In addition, when content creators exhibit personal biases, the reliability of the
information necessitates further investigation. In this regard, we focus on subjectivity and
political bias analysis, specifically addressing Task 2 [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] and Task 3A [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] in CheckThat!
Lab at CLEF 2023 [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p>Subjectivity analysis entails identifying whether a sentence is subjective or objective,
while political bias analysis involves detecting left, right, and center opinions within an
article. In our study, we propose the utilization of ChatGPT for both tasks. ChatGPT is
renowned for generating high-quality responses across various domains, and we think that it
can also be utilized for classification problems, as it eliminates the need for training or
tuning processes. Thus, we adapt ChatGPT to the classification tasks in CheckThat! 2023 Lab
and evaluate its performance in zero-shot and few-shot settings.</p>
      <p>Our findings can be summarised as follows: 1) With the exception of the Dutch and
German datasets, ChatGPT falls behind the baseline method in Task 2. 2) ChatGPT is
more effective in detecting political bias than subjectivity. Our approach in Task
3A is ranked 2nd. 3) Interestingly, ChatGPT performs better in zero-shot classification
than in the few-shot setting in both tasks.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <sec id="sec-2-1">
        <title>2.1. Subjectivity Detection</title>
        <p>Subjectivity analysis has been extensively explored by prior work. Numerous studies
have dedicated efforts towards subjectivity detection, recognizing its importance as a
preliminary step in other NLP tasks [6, 7]. Chaturvedi et al. [7] and Montoyo et al. [8]
specifically concentrate on subjectivity detection to enhance the quality of sentiment
analysis. In the case of Sixto et al. [6], subjectivity analysis serves as the primary step
in polarity detection tasks. Riloff et al. [9] exploit subjectivity to enhance the precision of
information extraction systems. Wilson et al. [10] design a comprehensive system for
detailed subjectivity analysis. Apart from subjective text classification, their system also
aims to extract textual elements that contribute to subjectivity.</p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Political Bias</title>
        <p>An extensive amount of research has been conducted to investigate the presence of
political bias in news articles, shedding light on its impact and various dimensions [11, 12].
Content-based bias detection is generally conducted at two levels of granularity: the
article level [13] and the sentence level [14]. As a precursor to this problem, Lin et al. [15]
use statistical methods such as Naive Bayes and SVM to analyze political orientation at the
document and sentence levels. Chen et al. [12] demonstrate the detection of media
bias using sequential models and illustrate the possibility of revealing the bias at different
granularity levels. More recently, Hong et al. [11] propose a more robust and general
multi-head attention model that overcomes the issue of domain dependency.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Proposed Approach</title>
      <p>In this section, we describe how we employ ChatGPT for zero-shot and few-shot
classification. First, we convert classification tasks into queries in order to interact with
ChatGPT. These queries comprise two components: a command part and a content
part. While the content part corresponds to the text to be classified, the command
part specifies the desired output based on the content. In our scenario, the content
corresponds to a sentence (Task 2) or a news article (Task 3A). The command presents
labels and instructs ChatGPT to classify the content based on the labels. We use separate
commands for the two tasks we participate in.</p>
      <p>Few-shot examples for Task 2 (Table 1), each followed by its label:</p>
      <list list-type="order">
        <list-item><p>Theresa May made this more explicit: ‘Socialism is about levelling down. (Objective)</p></list-item>
        <list-item><p>But he said the risk of a violent backlash had grown this year. (Objective)</p></list-item>
        <list-item><p>Many originate with educational, recreational and sociological enthusiasts [...]. (Objective)</p></list-item>
        <list-item><p>This week, authorities in Belgrade put a stop to EuroPride, [...]. (Objective)</p></list-item>
        <list-item><p>“Normally, the majority opinion would speak for itself. The decision is [...]. (Objective)</p></list-item>
        <list-item><p>White House officials have touted their efforts to cut down on the paperwork [...]. (Subjective)</p></list-item>
        <list-item><p>One day’s work in every four belongs to government. (Subjective)</p></list-item>
        <list-item><p>The battle to set our economic machine in motion in this emergency takes new [...]. (Subjective)</p></list-item>
        <list-item><p>An indebted state-owned bus company in Lanzhou, the capital of Gansu [...] (Subjective)</p></list-item>
        <list-item><p>Still, many of the nation’s 3.1 million public-school teachers have become [...]. (Subjective)</p></list-item>
      </list>
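      <p>The command-plus-content construction described above might be sketched as follows. This is only an illustration under stated assumptions: the function name build_query and the way the content is quoted are hypothetical, while the command string and the layout of the optional example block follow the Task 3A query printed in this paper.</p>
      <preformat>
```python
from typing import List, Optional, Tuple

# Command text as printed in the Task 3A query of this paper.
TASK_3A_COMMAND = (
    "Choose a political leaning for this text, "
    "answer with only one word(left,center,right):"
)


def build_query(
    command: str,
    content: str,
    examples: Optional[List[Tuple[str, str]]] = None,
) -> str:
    """Join the command part and the content part into one query string.

    `examples` is an optional list of (text, label) pairs; when given,
    they are appended as a numbered few-shot block (hypothetical helper).
    """
    query = f'{command} "{content}"'
    if examples:
        lines = ["Here are some examples of the classes:"]
        for i, (text, label) in enumerate(examples, start=1):
            lines.append(f'{i} - "{text}" : ({label})')
        query = query + "\n" + "\n".join(lines)
    return query
```
      </preformat>
      <p>Calling build_query(TASK_3A_COMMAND, article_text) would give a zero-shot query, and passing a list of (text, label) pairs as the third argument would give the few-shot variant.</p>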
      <sec id="sec-3-1">
        <p>In the context of few-shot classification, our objective is to boost classification
performance by providing ChatGPT with ground-truth classification examples, which consist
of pairs of statements and their corresponding labels. To determine which examples
to provide in Task 2, we initially conduct zero-shot classification on the training data.
Subsequently, we randomly choose five misclassified training samples from each class and
utilize them as ground-truth examples for few-shot classification on the validation and
test data. In Task 3A, we follow a slightly different approach: we randomly
select five examples from the whole training set (i.e., not only misclassified ones) for
each label and provide these samples for few-shot classification. For each prediction, we
re-sample the examples. Table 1 and Table 2 illustrate how we utilize ChatGPT for
zero-shot and few-shot classification.</p>
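        <p>A minimal sketch of the two example-selection strategies described above, written as one helper. The function name sample_per_class, the (text, label) tuple layout, and the optional seeded random generator are assumptions made for illustration and are not taken from the submitted system.</p>
        <preformat>
```python
import random
from typing import Dict, List, Optional, Tuple

Sample = Tuple[str, str]  # (text, gold label)


def sample_per_class(
    samples: List[Sample],
    k: int = 5,
    predictions: Optional[List[str]] = None,
    rng: Optional[random.Random] = None,
) -> List[Sample]:
    """Randomly pick up to k samples for each gold label.

    If `predictions` is given (zero-shot outputs aligned with `samples`),
    only misclassified samples are eligible, as described for Task 2.
    Without it, every training sample is eligible, as described for Task 3A.
    """
    rng = rng or random.Random()
    by_label: Dict[str, List[Sample]] = {}
    for i, (text, gold) in enumerate(samples):
        if predictions is not None and predictions[i] == gold:
            continue  # correctly classified, so not eligible
        by_label.setdefault(gold, []).append((text, gold))
    chosen: List[Sample] = []
    for label in sorted(by_label):
        pool = by_label[label]
        chosen.extend(rng.sample(pool, min(k, len(pool))))
    return chosen
```
        </preformat>
        <p>For the Task 3A setting, calling the helper again before every prediction would mirror the re-sampling step; for Task 2, the selection would be made once from the zero-shot errors on the training data.</p>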
        <p>One of the problems we face during the utilization of ChatGPT for classification
the same text with different titles. We encountered the same issue with 45 samples in
the test set. For these articles, we combine the title and content sections in our queries.
In addition, in some cases ChatGPT returns “right-center” and “left-center” responses
while we do not have these labels. In these cases, we select the majority class in the
train-validation data.</p>
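        <p>The fallback for out-of-vocabulary answers could be sketched as below. The function name normalize_response and the light cleaning of the reply (trimming whitespace and a trailing period, lowercasing) are assumptions for illustration; only the majority-class fallback itself is taken from the description above.</p>
        <preformat>
```python
from collections import Counter
from typing import List


def normalize_response(response: str, labels: List[str], train_labels: List[str]) -> str:
    """Map a raw model reply onto the allowed label set.

    Replies that do not match any allowed label exactly (for instance
    "right-center") fall back to the majority class of `train_labels`.
    """
    # Cleaning step is an assumption, not described in the paper.
    cleaned = response.strip().strip(".").lower()
    if cleaned in labels:
        return cleaned
    majority_label, _count = Counter(train_labels).most_common(1)[0]
    return majority_label
```
        </preformat>
        <p>Here train_labels would hold the gold labels of the combined train and validation data, so that the fallback matches the majority class mentioned above.</p>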
        <p>Table 2. Zero-shot and few-shot political bias prediction using ChatGPT for Task 3A</p>
        <p>Zero-shot query: Choose a political leaning for this text, answer with only one word(left,center,right): “After
2½ years of civil war in Syria, President Barack Obama’s larger policy is in disarray even as his
administration, with help from Russia, averted a military...”</p>
        <p>Few-shot query: Choose a political leaning for this text, answer with only one word(left,center,right): “After
2½ years of civil war in Syria, President Barack Obama’s larger policy is in disarray even as his
administration, with help from Russia, averted a military...”</p>
        <p>Here are some examples of the classes:</p>
        <list list-type="order">
          <list-item><p>“Senate Judiciary Committee Chairman Lindsey Graham (R-S.C.) said [...].” : (Center)</p></list-item>
          <list-item><p>“The Senate confirmed Antony Blinken to be President Biden’s Secretary of State [...].” : (Center)</p></list-item>
          <list-item><p>“President Biden directed the Department of Energy on Tuesday to release 50 [...].” : (Center)</p></list-item>
          <list-item><p>“News ‘Every Blessing To Her And Her Film’: Shia LaBeouf Reacts After Olivia [...].” : (Center)</p></list-item>
          <list-item><p>“News What to expect as Democrats retain the Senate for the next two years [...].” : (Center)</p></list-item>
          <list-item><p>“As the House of Representatives is dragged closer to a vote on authorizing [...].” : (Left)</p></list-item>
          <list-item><p>“ANALYSIS Democrats came into the 2020 Senate elections as slight favorites [...].” : (Left)</p></list-item>
          <list-item><p>“News Iranian Women Are Burning Their Hijabs And Cutting Their Hair [...].” : (Left)</p></list-item>
          <list-item><p>“The emails to H don’t contain a smoking gun, at least not yet [...].” : (Left)</p></list-item>
          <list-item><p>“ANALYSIS The growing Trump-Biden war over China, explained [...].” : (Left)</p></list-item>
          <list-item><p>“Yesterday during a press conference, Attorney General Eric Holder [...].” : (Right)</p></list-item>
          <list-item><p>“He was not on the ballot, but former President Trump was one of [...].” : (Right)</p></list-item>
          <list-item><p>“The scandals facing the White House particularly the Benghazi [...].” : (Right)</p></list-item>
          <list-item><p>“News Hearing aids now available over the counter for first time [...].” : (Right)</p></list-item>
          <list-item><p>“The Supreme Court delivered a dramatic change to abortion jurisprudence [...].” : (Right)</p></list-item>
        </list>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Experiments</title>
      <p>In this section, we first describe our experimental setup. Subsequently, we present our
results.</p>
      <sec id="sec-4-1">
        <title>4.1. Experimental Setup</title>
        <p>The dataset created for Task 2 covers six languages and a multilingual setting. Table 3
presents the language-specific label distribution. For Task 3A, the training and development
datasets consist of JSON files that contain title, content, and label data. The data
distributions of the train and validation sets are presented in Table 4. Due to time
limitations and the slow response time of ChatGPT, we could use only 1,000 samples from
the validation set in Task 3A. We use Turkish queries for the Turkish dataset and English
queries for the other languages.</p>
        <p>In Task 2, we report accuracy and macro F<sub>1</sub> scores. In Task 3A, we report mean
absolute error (MAE), i.e., the official metric of the lab, as well as accuracy, precision, recall,
and F<sub>1</sub>. Due to the imbalanced label distribution, we present the weighted scores for all
metrics except MAE.</p>
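        <p>For reference, the two headline metrics can be sketched in plain Python as follows. The function names are illustrative, and the MAE helper assumes that the political leaning labels are encoded as ordinal integers (for example left=0, center=1, right=2), which is an assumption of this sketch rather than a detail stated above.</p>
        <preformat>
```python
from typing import List, Sequence


def mean_absolute_error(gold: Sequence[int], pred: Sequence[int]) -> float:
    """Average absolute distance between gold and predicted class indices."""
    return sum(abs(g - p) for g, p in zip(gold, pred)) / len(gold)


def macro_f1(gold: Sequence[str], pred: Sequence[str]) -> float:
    """Unweighted mean of the per-class F1 scores."""
    scores: List[float] = []
    for label in sorted(set(gold) | set(pred)):
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)
```
        </preformat>
        <p>Under the assumed ordinal encoding, predicting right for a left article costs twice as much as predicting center, which is why MAE suits a task with ordered labels; a lower MAE is better, whereas a higher macro F<sub>1</sub> is better.</p>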
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Experimental Results</title>
        <sec id="sec-4-2-1">
          <title>4.2.1. Task 2</title>
          <p>Table 5 presents zero-shot and few-shot classification performance on the validation
dataset. The evaluation of few-shot classification does not include multilingual data,
mainly because its samples can belong to any of the six languages, which makes it
challenging to select appropriate ground-truth examples. As can be seen in the table,
zero-shot classification outperforms few-shot classification, except for Italian
and Turkish. Thus, we submitted few-shot predictions for the test data in these two
languages, while utilizing zero-shot predictions for the others.</p>
          <p>Table 6 shows the scores and rankings obtained from the submitted predictions for
each language on the test dataset. In comparison to the validation scores, the test scores
are generally lower, with the exception of Dutch. We achieve the best ranking in German.
While the subjectivity detection performance of ChatGPT is impressive, it
is lower than that of most participants.</p>
        </sec>
        <sec id="sec-4-2-2">
          <title>4.2.2. Task 3A</title>
          <p>Table 7 presents results for the validation set. We observe that zero-shot classification
outperforms few-shot classification. Thus, we submit zero-shot classification results for the test
data. The results on the test set are shown in Table 8. Our ranking in Task 3A is better
than our ranking in Task 2, suggesting that ChatGPT is more suitable for political bias
detection than for subjectivity detection.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>In this paper, we present our participation in Task 2 and Task 3A of the CLEF-2023
CheckThat! Lab. For both tasks, we utilize ChatGPT with two different approaches: zero-shot
and few-shot classification.</p>
      <p>In Task 2, we could not surpass the baseline in five settings (Multilingual, Turkish,
Arabic, Italian, and English) with the approach we employed. We achieved the best
results in Dutch and German, where we outperformed the baseline and secured
the 3rd position in the rankings.</p>
      <p>for Arabic, Dutch, English, German, Italian, and Turkish languages, respectively. In
Task 3A, we are ranked 2nd (out of 5 groups).</p>
      <p>In the future, we plan to explore how ChatGPT can be utilized more effectively in
these tasks. In particular, we plan to investigate the impact of how prompts are written,
the size of the samples provided in the few-shot approach, and how to select the samples.</p>
      <p>[6] J. Sixto, A. Almeida, D. López-de Ipiña, An approach to subjectivity detection on
Twitter using the structured information, in: Computational Collective Intelligence:
8th International Conference, ICCCI 2016, Halkidiki, Greece, September 28-30, 2016,
Proceedings, Part I 8, Springer, 2016, pp. 121–130.</p>
      <p>[7] I. Chaturvedi, E. Ragusa, P. Gastaldo, R. Zunino, E. Cambria, Bayesian network
based extreme learning machine for subjectivity detection, Journal of The Franklin
Institute 355 (2018) 1780–1797.</p>
      <p>[8] A. Montoyo, P. Martínez-Barco, A. Balahur, Subjectivity and sentiment analysis:
An overview of the current state of the area and envisaged developments, Decision
Support Systems 53 (2012) 675–679.</p>
      <p>[9] E. Riloff, J. Wiebe, W. Phillips, Exploiting subjectivity classification to improve
information extraction (2005).</p>
      <p>[10] T. Wilson, P. Hoffmann, S. Somasundaran, J. Kessler, J. Wiebe, Y. Choi, C. Cardie,
E. Riloff, S. Patwardhan, Opinionfinder: A system for subjectivity analysis, in:
Proceedings of HLT/EMNLP 2005 Interactive Demonstrations, 2005, pp. 34–35.</p>
      <p>[11] J. Hong, Y. Cho, J. Jung, J. Han, J. Thorne, Disentangling structure and style:
Political bias detection in news by inducing document hierarchy, arXiv preprint
arXiv:2304.02247 (2023).</p>
      <p>[12] W.-F. Chen, K. Al-Khatib, H. Wachsmuth, B. Stein, Analyzing political bias
and unfairness in news articles at different levels of granularity, arXiv preprint
arXiv:2010.10652 (2020).</p>
      <p>[13] Y. Liu, X. F. Zhang, D. Wegsman, N. Beauchamp, L. Wang, Politics: pretraining
with same-story article comparison for ideology prediction and stance detection,
arXiv preprint arXiv:2205.00619 (2022).</p>
      <p>[14] Y. Lei, R. Huang, L. Wang, N. Beauchamp, Sentence-level media bias analysis
informed by discourse structures, in: Proceedings of the 2022 Conference on
Empirical Methods in Natural Language Processing, 2022, pp. 10040–10050.</p>
      <p>[15] W.-H. Lin, T. Wilson, J. Wiebe, A. G. Hauptmann, Which side are you on?
Identifying perspectives at the document and sentence levels, in: Proceedings of the
Tenth Conference on Computational Natural Language Learning (CoNLL-X), 2006,
pp. 109–116.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Y. S.</given-names>
            <surname>Kartal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kutlu</surname>
          </string-name>
          ,
          <article-title>Re-think before you share: A comprehensive study on prioritizing check-worthy claims</article-title>
          ,
          <source>IEEE Transactions on Computational Social Systems</source>
          <volume>10</volume>
          (
          <year>2022</year>
          )
          <fpage>362</fpage>
          -
          <lpage>375</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>H.</given-names>
            <surname>Rashkin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Choi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. Y.</given-names>
            <surname>Jang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Volkova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Choi</surname>
          </string-name>
          ,
          <article-title>Truth of varying shades: Analyzing language in fake news and political fact-checking</article-title>
          ,
          in:
          <source>Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing</source>
          ,
          <year>2017</year>
          , pp.
          <fpage>2931</fpage>
          -
          <lpage>2937</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>A.</given-names>
            <surname>Galassi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Ruggeri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Barrón-Cedeño</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Alam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Caselli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kutlu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Struss</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Antici</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hasanain</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Köhler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Korre</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Leistra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Muti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Siegel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. D.</given-names>
            <surname>Turkmen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wiegand</surname>
          </string-name>
          , W. Zaghouani,
          <article-title>Overview of the CLEF-2023 CheckThat! lab task 2 on subjectivity in news articles</article-title>
          , in:
          <source>Working Notes of CLEF 2023 - Conference and Labs of the Evaluation Forum</source>
          , CLEF '2023, Thessaloniki, Greece,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>G.</given-names>
            <surname>Da San Martino</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Alam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hasanain</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. N.</given-names>
            <surname>Nandi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Azizov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Nakov</surname>
          </string-name>
          ,
          <article-title>Overview of the CLEF-2023 CheckThat! lab task 3 on political bias of news articles and news media</article-title>
          , in:
          <source>Working Notes of CLEF 2023 - Conference and Labs of the Evaluation Forum</source>
          , CLEF '2023, Thessaloniki, Greece,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>A.</given-names>
            <surname>Barrón-Cedeño</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Alam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Caselli</surname>
          </string-name>
          , G. Da San Martino, T. Elsayed,
          <string-name>
            <given-names>A.</given-names>
            <surname>Galassi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Haouari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Ruggeri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Struss</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. N.</given-names>
            <surname>Nandi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. S.</given-names>
            <surname>Cheema</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Azizov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Nakov</surname>
          </string-name>
          ,
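          <article-title>The CLEF-2023 CheckThat! lab: Checkworthiness, subjectivity, political bias, factuality, and authority</article-title>
          , in:
          <string-name><given-names>J.</given-names> <surname>Kamps</surname></string-name>
          ,
          <string-name><given-names>L.</given-names> <surname>Goeuriot</surname></string-name>
          ,
          <string-name><given-names>F.</given-names> <surname>Crestani</surname></string-name>
          ,
          <string-name><given-names>M.</given-names> <surname>Maistro</surname></string-name>
          ,
          <string-name><given-names>H.</given-names> <surname>Joho</surname></string-name>
          ,
          <string-name><given-names>B.</given-names> <surname>Davis</surname></string-name>
          ,
          <string-name><given-names>C.</given-names> <surname>Gurrin</surname></string-name>
          ,
          <string-name><given-names>U.</given-names> <surname>Kruschwitz</surname></string-name>
          ,
          <string-name><given-names>A.</given-names> <surname>Caputo</surname></string-name>
          (Eds.),
          <source>Advances in Information Retrieval</source>
          , Springer Nature Switzerland, Cham,
          <year>2023</year>
          , pp.
          <fpage>506</fpage>
          -
          <lpage>517</lpage>
          .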
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>