<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Overview of the CLEF-2023 CheckThat! Lab: Task 2 on Subjectivity in News Articles</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Andrea Galassi</string-name>
          <email>a.galassi@unibo.it</email>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Federico Ruggeri</string-name>
          <email>federico.ruggeri6@unibo.it</email>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alberto Barrón-Cedeño</string-name>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Firoj Alam</string-name>
          <email>ifalam@hbku.edu.qa</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tommaso Caselli</string-name>
          <email>t.caselli@rug.nl</email>
          <xref ref-type="aff" rid="aff5">5</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mucahid Kutlu</string-name>
          <email>m.kutlu@etu.edu.tr</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Julia Maria Struß</string-name>
          <email>julia.struss@fh-potsdam.de</email>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Francesco Antici</string-name>
          <email>francesco.antici@unibo.it</email>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Maram Hasanain</string-name>
          <email>mhasanain@hbku.edu.qa</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Juliane Köhler</string-name>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Katerina Korre</string-name>
          <email>aikaterini.korre2@unibo.it</email>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Folkert Leistra</string-name>
          <email>f.a.leistra@student.rug.nl</email>
          <xref ref-type="aff" rid="aff5">5</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Arianna Muti</string-name>
          <email>arianna.muti2@unibo.it</email>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Melanie Siegel</string-name>
          <email>melanie.siegel@h-da.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mehmet Deniz Türkmen</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Michael Wiegand</string-name>
          <email>michael.wiegand@aau.at</email>
          <xref ref-type="aff" rid="aff6">6</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Wajdi Zaghouani</string-name>
          <email>wzaghouani@hbku.edu.qa</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Darmstadt University of Applied Sciences</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Qatar Computing Research Institute</institution>
          ,
          <addr-line>HBKU</addr-line>
          ,
          <country country="QA">Qatar</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>TOBB University of Economics and Technology</institution>
          ,
          <country country="TR">Turkiye</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>University of Applied Sciences Potsdam</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff4">
          <label>4</label>
          <institution>University of Bologna</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff5">
          <label>5</label>
          <institution>University of Groningen</institution>
          ,
          <country country="NL">Netherlands</country>
        </aff>
        <aff id="aff6">
          <label>6</label>
          <institution>University of Klagenfurt</institution>
          ,
          <country country="AT">Austria</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2023</year>
      </pub-date>
      <abstract>
<p>We describe the outcome of the 2023 edition of the CheckThat! Lab at CLEF. We focus on subjectivity (Task 2), which has been proposed for the first time. It aims at fostering technology for the identification of subjective text fragments in news articles. For that, we produced corpora consisting of 9,530 manually annotated sentences, covering six languages: Arabic, Dutch, English, German, Italian, and Turkish. Task 2 attracted 12 teams, which submitted a total of 40 final runs covering all languages. The most successful approaches addressed the task using state-of-the-art multilingual transformer models, which were fine-tuned on language-specific data. Teams also experimented with a rich set of other neural architectures, including foundation models, zero-shot classifiers, and standard transformers, mainly coupled with data augmentation and multilingual training strategies to address class imbalance. We publicly release all the datasets and evaluation scripts, with the purpose of promoting further research on this topic.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        The CheckThat! Lab [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ] was organized for the 6th time in the framework of CLEF 2023. This paper presents an overview of Task 2, on the identification of subjectivity in news articles, which is organized for the first time.1
      </p>
      <p>In objective sentences, the information is usually presented in a straightforward way. In contrast, subjective sentences often include specific vocabulary, figures of speech, or other elements that make them more difficult for machine learning models to analyze.</p>
      <p>
        In the context of fact-checking, objective sentences can be directly fed to a fact-checking pipeline for verification, while subjective ones require an additional processing step aimed at extracting a claim, or must simply be discarded. Moreover, the presence of subjective content may be a useful feature that could facilitate downstream tasks in the fact-checking pipeline, such as detecting political bias [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] and the factuality of reporting [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p>Given a set of sentences taken from a news article,2 Task 2 requires classifying each sentence as subjective or objective. A sentence is considered subjective if its contents are based on or influenced by personal feelings, tastes, or opinions; otherwise, the sentence is considered objective. The task is offered in Arabic, Dutch, English, German, Italian, and Turkish, with an additional multilingual setting.3</p>
      <p>The task attracted 12 participating teams, for a total of 40 final submissions. Submitted approaches include large language models and generative models, such as ChatGPT and GPT-3, pre-training on multilingual data, data augmentation techniques, ensembles of classifiers, and feature selection. Transformer-based architectures proved to be the most successful, especially when pre-trained on multilingual data or trained on augmented data. Nonetheless, the task is not yet solved and there is still room for improvement.</p>
      <p>The remainder of the paper is organized as follows. Section 2 discusses related work. Section 3 describes the multilingual data we produced for the task. Section 4 overviews the models proposed by the different participants and the results they obtained. Section 5 closes with final remarks and potential future work.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        Previous studies have explored the contribution of subjectivity detection (SD) technology to well-known downstream tasks, such as sentiment analysis [
        <xref ref-type="bibr" rid="ref7 ref8 ref9">7, 8, 9</xref>
        ] and bias detection [10, 11]. SD can also influence other tasks, such as claim extraction [12, 13] and, crucially for our context, fact-checking [14, 15, 16, 17].
      </p>
      <p>
        The inherent difficulty of providing a practical definition of subjectivity [
        <xref ref-type="bibr" rid="ref8">8, 18</xref>
        ] has led to several formulations of the task at hand, as it is often influenced by domain-specific assumptions and lacks a schematic definition, in particular for data collection. In previous work, corpora for SD were developed in several different ways, such as by relying on domain-specific assumptions [19, 12, 20, 21, 22] or statistical methods [23, 24]. We instead rely on a prescriptive approach [25], in which SD is conceived as a step that can contribute to tasks such as claim extraction and fact-checking, and data are collected by framing the task around domain-specific objectives and proposing pragmatic annotation guidelines.
      </p>
      <p>1 Refer to Alam et al. [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], Da San Martino et al. [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], Nakov et al. [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], and Haouari et al. [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] to read about Tasks 1, 3, 4, and 5, respectively.</p>
      <p>2 We note that the Turkish dataset uses sentences taken from tweets.</p>
      <p>3 All the data and scripts are publicly available at https://checkthat.gitlab.io/clef2023/task2/.</p>
      <p>
        Following Chaturvedi et al. [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], we distinguish between syntactic and semantic solutions for SD. The first category of approaches relied on keyword spotting [19, 12] or lexicons [20, 21, 22, 26]. In contrast, the semantic category encompasses statistical methods [23, 24] or neural architectures, such as convolutional neural networks [27], deep belief networks [28], and transformer architectures like BERT [29]. To the best of our knowledge, a systematic approach for SD leveraging state-of-the-art language models is yet to be proposed.
      </p>
      <p>
        Regarding language coverage, most studies have focused on English alone. Some contributions have extended to other languages, such as Arabic [30], German [30], French [26, 30], Italian [29], Romanian [13, 30], and Spanish [30]. Most of these attempts have mainly relied on machine translation and monolingual ontologies, which inevitably introduce noise when moving across languages. In order to produce the datasets for this task, we annotated corpora in six languages from scratch, relying on native (or near-native) speakers of all languages and following a common set of language-agnostic guidelines [
        <xref ref-type="bibr" rid="ref10">31</xref>
        ].
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Datasets</title>
      <sec id="sec-3-1">
        <title>3.1. Arabic</title>
        <p>
          All datasets considered in the task were created following the annotation guidelines presented
in [
          <xref ref-type="bibr" rid="ref10">31</xref>
          ]. In total, about 10k sentences were considered for the task. Table 1 gives statistics of the
corpora in all languages, while Table 2 shows examples of subjective and objective sentences.
Arabic news articles were annotated by three native speakers. The annotators chosen for the
subjectivity annotation task have diverse Arabic-speaking backgrounds, including Egyptian,
Yemeni, and Bahraini. Each annotator is proficient in Modern Standard Arabic (MSA) but brings
their own dialect and unique forms of expression. One annotator has expertise in linguistics and
computational skills, while the other two annotators specialize in the humanities, specifically
digital and political domains.
        </p>
        <p>Table 2 pairs one subjective (SUBJ) and one objective (OBJ) sentence per language; the Arabic examples are not reproduced here. Dutch. SUBJ: "De nieuwe status van Bonaire, Sint Eustatius en Saba is een stap naar verbetering." OBJ: "Dante slaagde erin om de hel te verruilen voor de relatief milde vlammen van het Vagevuur."</p>
        <p>English. SUBJ: "While it’s misguided to put all focus or hope onto one section of the working class, we can’t ignore this immense latent power that logistics workers possess." OBJ: "Workers would have a 24 percent wage increase by 2024, including an immediate 14 percent raise."</p>
        <p>German. SUBJ: "Für die Pandemie-Macher ist es zugleich von strategischer Bedeutung, die Kontrollgruppe der Ungeimpften zu eliminieren – und dies möglichst schnell." OBJ: "Der andere Angeklagte bekundete, er könne sich an den ganzen Vorgang nicht erinnern."</p>
        <p>Italian. SUBJ: "Hanno festeggiato il matrimonio come se non ci fosse il coronavirus." OBJ: "Tutti sono stati identificati e multati per aver violato le norme anti contagio per il contenimento del fenomeno epidemico."</p>
        <p>Turkish. SUBJ: "Kılıçdaroğlu laikliğe aykırı davranmaya devam ediyor: Bu sefer Kur’an öpüp başına koydu." OBJ: "Akşener basına seslendi: Emekçilerin günlük hayatlarını yaşanır hale getirin."</p>
        <p>We ensured annotators’ suitability for the task by selecting university-level annotators with a strong Arabic language background, including research experience or relevant degrees. They underwent a screening test and received two to three weeks of training, which included group discussion of annotation tasks, guideline reading, and meetings with the lead annotator.</p>
        <p>Data collection involved four phases. In Phase 1, Arabic sentences from news articles were selected, filtered, and parsed. In Phase 2, sentences from the 12 most frequent news domains were chosen for annotation. Due to labeling skew, Phase 3 focused on selecting sentences with a higher probability of subjectivity using an SVM classifier. Finally, in Phase 4, the annotated sentences were reviewed, filtering out uncertain labels and acquiring the majority label per sentence for the released dataset.</p>
      </sec>
      <sec id="sec-3-1-6">
        <title>3.2. Dutch</title>
        <p>
          All the Dutch sentences were sourced from the DPG Media 2019 dataset [
          <xref ref-type="bibr" rid="ref11">32</xref>
          ].4 This dataset contains partisanship annotations for Dutch newspaper articles based on two methods: publisher-level and article-level. For the task, only the article-level annotations were retained, as they are based on the actual content of the article, thus ensuring a higher annotation quality. All articles annotated as containing some form of partisanship were gathered and then split into sentences. Next, a total of 1,500 sentences were manually annotated by one native speaker. In order to evaluate the generalizability of the participating systems, articles from publishers that were not present in the training set have been intentionally kept for testing purposes.
        </p>
      </sec>
      <sec id="sec-3-2">
        <title>3.3. English</title>
        <p>
          For English, we use the corpus created by Antici et al. [
          <xref ref-type="bibr" rid="ref10">31</xref>
          ] for the training and development splits. The corpus was created by annotating sentences from articles on controversial topics published in popular outlets.5 Six annotators took part in the annotation effort, with each instance being judged by two of them. Annotators gathered together later on to discuss and solve disagreements, relying on a seventh annotator to solve conflicts when necessary. We developed a novel test set following the same procedure, containing 243 sentences that come from the same news outlets as the other partitions. The Krippendorff’s alpha inter-annotator agreement (IAA) on the test set was 0.85 (nearly perfect agreement), similar to the 0.83 of the training and development splits.
        </p>
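        <p>For the special case reported here (two annotators, nominal SUBJ/OBJ labels, no missing annotations), Krippendorff’s alpha reduces to a short computation over the coincidence matrix. The sketch below is an illustrative implementation, not the evaluation script used by the organizers.</p>

```python
from collections import Counter
from itertools import permutations

def krippendorff_alpha_nominal(coder_a, coder_b):
    """Krippendorff's alpha for two annotators, nominal labels, no missing data."""
    assert len(coder_a) == len(coder_b)
    # Coincidence matrix: each annotated unit contributes both ordered label pairs.
    o = Counter()
    for x, y in zip(coder_a, coder_b):
        o[(x, y)] += 1
        o[(y, x)] += 1
    n = sum(o.values())  # total pairable values (= 2 * number of units)
    totals = Counter()
    for (x, _), c in o.items():
        totals[x] += c
    # Observed vs. expected disagreement for the nominal difference function.
    d_o = sum(c for (x, y), c in o.items() if x != y) / n
    d_e = sum(totals[x] * totals[y]
              for x, y in permutations(totals, 2)) / (n * (n - 1))
    return 1.0 - d_o / d_e
```

Perfect agreement yields 1.0; values near 0 indicate chance-level agreement.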
      </sec>
      <sec id="sec-3-3">
        <title>3.4. German</title>
        <p>
          The training, development, and test sets for German were assembled by randomly selecting sentences from the CT!2022FAN-Corpus [
          <xref ref-type="bibr" rid="ref12">33</xref>
          ], consisting of news articles that have been annotated according to the factuality of their main claim [
          <xref ref-type="bibr" rid="ref13">34</xref>
          ]. Each sentence was annotated by two annotators. In total, five native speakers were involved in the annotation process. Conflicts were solved by asking a third annotator for their judgement.
        </p>
      </sec>
      <sec id="sec-3-4">
        <title>3.5. Italian</title>
        <p>The training and development data for Italian are mostly derived from the SubjectivITA corpus [29] and consist of 1,841 sentences. Rather than using the original annotation, we re-annotated the corpus following our up-to-date guidelines. The re-annotation resulted in the class-switching of 157 sentences. As for the test set, we released a novel collection of 440 sentences, gathered from popular Italian news outlets.6 The annotation follows the same methodology used for the English dataset, involving five annotators plus one to solve conflicts. The IAA score on this novel test set is 0.91, which corresponds to nearly perfect agreement.</p>
        <p>4 https://github.com/dpgmedia/partisan-news2019</p>
        <p>5 The outlets are tribunemag.co.uk, spectator.co.uk, shtfplan.com, vdare.com, theweek.com, frontpagemag.com, economist.com, and theguardian.com.</p>
        <p>6 We considered the following websites: corriere.it, avantionline.it, ilpost.it, avvenire.it, repubblica.it, ilfattoquotidiano.it, ilgiornale.it, ansa.it, ilfoglio.it, liberoquotidiano.it.</p>
      </sec>
      <sec id="sec-3-6">
        <title>3.6. Turkish</title>
        <p>To construct the Turkish dataset, we extracted sentences from tweets rather than from news articles. In particular, we first crawled Turkish tweets tracking keywords about politics. Subsequently, we removed similar tweets and then selected candidate tweets to be annotated. As judging tweets with incomplete and/or multiple sentences would be problematic, we manually selected a single sentence from each tweet and discarded the unsuitable ones. Two annotators first judged each sentence independently. Subsequently, they discussed with each other to reach an agreement on the sentences they disagreed on. We discarded the sentences on which the two annotators disagreed even after their discussion.</p>
      </sec>
      <sec id="sec-3-5">
        <title>3.7. Multilingual</title>
        <p>The multilingual dataset is composed of sentences sampled from the other datasets. For training, we proposed to use data from the other datasets. We proposed a development set resulting from the aggregation of 50 subjective and 50 objective sentences randomly sampled from the respective development sets of all the languages. The same procedure was followed for the test set, using the test sets of the other languages. While some teams followed our partition for training and development, others preferred to create their own splits.</p>
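        <p>The aggregation procedure described above can be sketched as follows; the function name, data layout, and random seed are illustrative choices, not the organizers’ actual code.</p>

```python
import random

def build_multilingual_split(per_language_splits, per_class=50, seed=0):
    """Aggregate per_class SUBJ and per_class OBJ sentences from each language.

    `per_language_splits` maps a language code to a list of (sentence, label)
    pairs with labels in {"SUBJ", "OBJ"}.
    """
    rng = random.Random(seed)
    combined = []
    for lang, pairs in sorted(per_language_splits.items()):
        for label in ("SUBJ", "OBJ"):
            pool = [(s, label, lang) for s, l in pairs if l == label]
            # Sample at most per_class sentences of this class and language.
            combined.extend(rng.sample(pool, min(per_class, len(pool))))
    rng.shuffle(combined)
    return combined
```

Applied to six languages, this yields a 600-sentence balanced multilingual split.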
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Overview of the Systems and Results</title>
      <p>Task 2 is formally defined as follows. Given a sentence s, extracted either from a news article or from a tweet (as in the case of Turkish), determine whether s is influenced by the subjective view of its author (class SUBJ) or presents an objective view of the covered topic (class OBJ).</p>
      <p>A total of 12 teams participated in this task, with most teams targeting more than one language, be it with the same or with different approaches. The participants experimented with multiple models from the BERT family, as well as with generative models.</p>
      <p>Table 3 offers a snapshot of the approaches, whereas Table 4 reports the performance results for all submissions, ranked on the basis of macro-averaged F1.7</p>
      <p>
        For the baselines, we implemented a logistic regressor trained on multilingual Sentence-BERT [
        <xref ref-type="bibr" rid="ref24">45</xref>
        ] representations of the data. For each language, we trained the baseline on the respective training data alone.
      </p>
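      <p>A minimal sketch of such a baseline follows. The Sentence-BERT encoding step is only indicated in a comment (it assumes the sentence-transformers library and an illustrative model name); the demo uses synthetic vectors in place of real embeddings.</p>

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score

# In the actual baseline, each sentence would first be embedded with a
# multilingual Sentence-BERT model, e.g. (hypothetical model choice):
#   from sentence_transformers import SentenceTransformer
#   X_train = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2").encode(train_sents)

def train_baseline(X_train, y_train):
    """Fit a per-language logistic regressor on sentence embeddings."""
    return LogisticRegression(max_iter=1000).fit(X_train, y_train)

def evaluate(clf, X, y):
    """Submissions are ranked by macro-averaged F1."""
    return f1_score(y, clf.predict(X), average="macro")

# Synthetic stand-in for embeddings: two well-separated clusters.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (50, 8)), rng.normal(3.0, 1.0, (50, 8))])
y = np.array([0] * 50 + [1] * 50)  # 0 = OBJ, 1 = SUBJ
clf = train_baseline(X, y)
```

Training one such regressor per language on that language’s training split reproduces the per-language baseline setup.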
      <p>
        Five of the teams experimented with the use of generative pre-trained transformers (GPT) [
        <xref ref-type="bibr" rid="ref15 ref17 ref22">36, 38, 43</xref>
        ] with different levels of success. Team DWReCo [
        <xref ref-type="bibr" rid="ref15">36</xref>
        ] obtained the top performance in English and was first runner-up in Turkish by using GPT-3 to reduce class imbalance with propaganda style-based data augmentation. The styles were identified from a journalistic checklist for identifying subjective news. Team Fraunhofer SIT [
        <xref ref-type="bibr" rid="ref17">38</xref>
        ] used GPT-3 as well, but their few-shot classification barely managed to beat the baseline. Team TUDublin [
        <xref ref-type="bibr" rid="ref22">43</xref>
        ] also performed data augmentation, this time using ChatGPT. Still, their classification model, built on top of M-BERT, did not improve over the baseline in any of the languages they participated in. Team TOBB ETU employed ChatGPT to directly classify the texts, experimenting with both zero- and few-shot classification.
      </p>
      <p>7 We decided to use the macro-averaged F1 rather than the F1 on the positive class, as usually done in binary tasks, to overcome the latter's limitation in contexts where the distribution of the classes is heavily imbalanced [
        <xref ref-type="bibr" rid="ref23">44</xref>
        ].</p>
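      <p>Zero- and few-shot classification with a generative model reduces to assembling a prompt and parsing the generated label. The prompt-construction step can be sketched as follows; the wording is purely illustrative, since the teams’ actual prompts are not reported here.</p>

```python
def build_few_shot_prompt(demonstrations, sentence):
    """Assemble a few-shot prompt for an instruction-tuned generative model.

    `demonstrations` is a list of (sentence, label) pairs with labels in
    {"SUBJ", "OBJ"}; an empty list yields a zero-shot prompt.
    """
    parts = ["Classify the sentence as SUBJ (subjective) or OBJ (objective)."]
    for text, label in demonstrations:
        parts.append(f"Sentence: {text}\nLabel: {label}")
    # The model is expected to continue the prompt with the label.
    parts.append(f"Sentence: {sentence}\nLabel:")
    return "\n\n".join(parts)
```

The resulting string is sent to the model, whose completion is then mapped back to one of the two classes.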
      <p>
        The second best performance in English was obtained by team Gpachov [
        <xref ref-type="bibr" rid="ref18">39</xref>
        ], who used an
ensemble of three distinct models: XLM-RoBERTa, Sentence BERT (S-BERT), and SetFit.
      </p>
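      <p>One common way to combine heterogeneous classifiers is a hard majority vote over their per-sentence predictions. Whether this particular ensemble used voting or score averaging is not detailed here, so the sketch below shows the voting variant purely as an illustration.</p>

```python
from collections import Counter

def majority_vote(per_model_predictions):
    """Combine several models' label sequences into one ensemble prediction.

    `per_model_predictions` is a list of equal-length label lists, one per
    model (e.g. outputs of three independently trained classifiers).
    """
    return [Counter(labels).most_common(1)[0][0]
            for labels in zip(*per_model_predictions)]
```

With an odd number of models and two classes, every vote has a unique winner, which keeps the combination rule deterministic.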
      <p>
        As observed in previous years, paying attention to all the languages paid off again. Team Thesis Titan [
        <xref ref-type="bibr" rid="ref20">41</xref>
        ] developed multiple fine-tuned models using mDeBERTaV3-base [
        <xref ref-type="bibr" rid="ref25">46</xref>
        ], starting from a newly developed multilingual dataset. While keeping the training data fixed, they used language-specific validation sets to optimize the models for each language, as well as to identify optimal language-specific hyperparameters. This approach resulted in the top performance in Dutch, German, Italian, and Turkish, and the second best in Arabic and in the multilingual setting. Team NN [
        <xref ref-type="bibr" rid="ref19">40</xref>
        ] relied on the multilingual XLM-RoBERTa. This approach resulted in the top performance in the multilingual setting, as well as in Arabic, with a top-3 performance in Dutch, German, Italian, and Turkish.
      </p>
      <p>The baseline logistic regressors are competitive in all settings, obtaining a score of at least 0.64 and often beating participant approaches. At least half of the submissions surpassed the baseline, with differences from the best approaches ranging from 18 percentage points (German) to 6 percentage points (English).</p>
      <p>Next we visit the landscape for the multilingual setting and for each language.</p>
      <p>Legend for Table 4: - run submitted after the deadline; † team involved in the preparation of the data; * no working note submitted.</p>
      <p>
        Multilingual. Five teams submitted runs in the multilingual setting. NN [
        <xref ref-type="bibr" rid="ref19">40</xref>
        ] obtained the first position with their approach based on XLM-RoBERTa. It is interesting to note that, in the monolingual settings, team Thesis Titan obtained a better score than team NN in 5 out of 6 languages, and than team tarrekko in 4 out of 6 languages. Nevertheless, their approach is not cross-language, since a model is built independently for each setting.
      </p>
      <p>Arabic. Five teams submitted their results, with team NN obtaining the best result of 0.79.
The three best approaches obtained a similar score (from 0.78 to 0.79), largely surpassing the
baseline score of 0.66.</p>
      <p>Dutch. Five teams participated, with Thesis Titan obtaining the best result of 0.81, surpassing
the baseline by about 15 percentage points.</p>
      <p>
        English. Out of 11 submissions, only seven surpassed the baseline, with team DWReCo [
        <xref ref-type="bibr" rid="ref15">36</xref>
        ] obtaining the first place. The four best approaches achieved similar scores, falling within the range [0.77, 0.78]. Similarly, the three approaches ranked fourth to sixth obtained comparable results, with scores of approximately 0.73, slightly higher than the baseline.
      </p>
      <p>German. We received seven submissions, and Thesis Titan achieved the highest result of
0.82, surpassing the baseline by 18 percentage points.</p>
      <p>Italian. Six teams participated, and Thesis Titan achieved the highest result of 0.76, surpassing the second-best result from team NN by 0.05 points and the baseline by 0.12 points.</p>
      <p>Turkish. Team Thesis Titan obtained the highest results among the six submissions, outperforming the baseline by approximately 13 percentage points.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>We have presented a detailed overview of Task 2 of the CheckThat! Lab at CLEF 2023. The task focused on the detection of subjectivity in sentences extracted from news articles. Following the objectives of CLEF, we offered the task in six different languages and in a multilingual setting, thus fostering multilinguality.</p>
      <p>Most of the submissions focused on the use of pre-trained models; some exploited more recent dialogue-based technologies such as ChatGPT. The most successful approaches incorporated additional knowledge into their models through multilingual pre-training or data augmentation. The best macro-averaged F1 scores ranged from 0.75 to 0.82, showing that the task can be successfully addressed but is not yet completely solved.</p>
      <p>Future work will be centered on extending high-quality multilingual datasets and broadening
the scope by including document-level classification settings.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>We are thankful to the volunteers that helped with the annotation of the data such as A. Bardi, A.
Fedotova, and K. Ebermanns. The work of A. Galassi is supported by the European Commission
NextGeneration EU programme, PNRR-M4C2-Investimento 1.3, PE00000013-“FAIR” - Spoke
8. A. Muti is supported by the program Progetti di formazione per la ricerca: Big Data per una
regione europea più ecologica, digitale e resiliente—Alma Mater Studiorum–Università di Bologna,
Ref. 2021-15854. K. Korre is supported by the PON programme FSE REACT-EU, Ref. DOT1303118.
The work related to the Arabic language was partially made possible by NPRP grant
NPRP13S0206-200281 and NPRP 14C-0916-210015 from the Qatar National Research Fund (a member
of Qatar Foundation). The work related to the Turkish language was funded by the Scientific
and Technological Research Council of Turkey (TUBITAK) ARDEB 3501 Grant No 120E514.
The work related to the German data has partially been funded by the BMBF (German Federal
Ministry of Education and Research) under the grant no. 01FP20031J. The responsibility for the
contents of this publication lies with the authors. The findings achieved herein are solely the
responsibility of the authors.
[10] D. Aleksandrova, F. Lareau, P. A. Ménard, Multilingual sentence-level bias detection in
Wikipedia, in: Proceedings of the International Conference on Recent Advances in Natural
Language Processing (RANLP 2019), INCOMA Ltd., Varna, Bulgaria, 2019, pp. 42–51. URL:
https://aclanthology.org/R19-1006. doi:10.26615/978-954-452-056-4_006.
[11] C. Hube, B. Fetahu, Neural based statement classification for biased language, in: J. S.</p>
      <p>Culpepper, A. Mofat, P. N. Bennett, K. Lerman (Eds.), Proceedings of the Twelfth ACM
International Conference on Web Search and Data Mining, WSDM 2019, Melbourne, VIC,
Australia, February 11-15, 2019, ACM, 2019, pp. 195–203. URL: https://doi.org/10.1145/
3289600.3291018. doi:10.1145/3289600.3291018.
[12] E. Rilof, J. Wiebe, Learning extraction patterns for subjective expressions, in: Proceedings
of the 2003 Conference on Empirical Methods in Natural Language Processing, 2003, pp.
105–112. URL: https://aclanthology.org/W03-1014.
[13] C. Banea, R. Mihalcea, J. Wiebe, Sense-level subjectivity in a multilingual setting, Comput.</p>
      <p>Speech Lang. 28 (2014) 7–19. URL: https://doi.org/10.1016/j.csl.2013.03.002. doi:10.1016/
j.csl.2013.03.002.
[14] L. L. Vieira, C. L. M. Jerônimo, C. E. C. Campelo, L. B. Marinho, Analysis of the subjectivity
level in fake news fragments, in: C. de Salles Soares Neto (Ed.), WebMedia ’20: Brazillian
Symposium on Multimedia and the Web, São Luís, Brazil, November 30 - December 4, 2020,
ACM, 2020, pp. 233–240. URL: https://doi.org/10.1145/3428658.3430978. doi:10.1145/
3428658.3430978.
[15] C. L. M. Jerônimo, L. B. Marinho, C. E. C. Campelo, A. Veloso, A. S. da Costa Melo, Fake
news classification based on subjective language, in: Proceedings of the 21st International
Conference on Information Integration and Web-based Applications &amp; Services, iiWAS
2019, Munich, Germany, December 2-4, 2019, ACM, 2019, pp. 15–24. URL: https://doi.org/
10.1145/3366030.3366039. doi:10.1145/3366030.3366039.
[16] F. Alam, S. Shaar, F. Dalvi, H. Sajjad, A. Nikolov, H. Mubarak, G. D. S. Martino, A. Abdelali,
N. Durrani, K. Darwish, A. Al-Homaid, W. Zaghouani, T. Caselli, G. Danoe, F. Stolk,
B. Bruntink, P. Nakov, Fighting the COVID-19 infodemic: Modeling the perspective of
journalists, fact-checkers, social media platforms, policy makers, and the society, in:
Findings of EMNLP 2021, 2021, pp. 611–649.
[17] P. Nakov, F. Alam, S. Shaar, G. Da San Martino, Y. Zhang, COVID-19 in Bulgarian
social media: Factuality, harmfulness, propaganda, and framing, in: Proceedings of the
International Conference on Recent Advances in Natural Language Processing (RANLP
2021), INCOMA Ltd., Held Online, 2021, pp. 997–1009. URL: https://aclanthology.org/2021.
ranlp-1.113.
[18] T. Wilson, J. Wiebe, Annotating opinions in the world press, in: Proceedings of the
SIGDIAL 2003 Workshop, The 4th Annual Meeting of the Special Interest Group on
Discourse and Dialogue, July 5-6, 2003, Sapporo, Japan, The Association for Computer
Linguistics, 2003, pp. 13–22. URL: https://aclanthology.org/W03-2102/.
[19] J. Wiebe, E. Riloff, Creating subjective and objective sentence classifiers from unannotated
texts, in: A. Gelbukh (Ed.), Computational Linguistics and Intelligent Text Processing,
Springer Berlin Heidelberg, Berlin, Heidelberg, 2005, pp. 486–497.
[20] N. Das, S. Sagnika, A subjectivity detection-based approach to sentiment analysis, in:
D. Swain, P. K. Pattnaik, P. K. Gupta (Eds.), Machine Learning and Information Processing,
Springer Singapore, Singapore, 2020, pp. 149–160.
[21] H. Yu, V. Hatzivassiloglou, Towards answering opinion questions: Separating facts from
opinions and identifying the polarity of opinion sentences, in: Proceedings of the 2003
Conference on Empirical Methods in Natural Language Processing, EMNLP ’03, Association
for Computational Linguistics, USA, 2003, p. 129–136. URL: https://doi.org/10.3115/1119355.
1119372. doi:10.3115/1119355.1119372.
[22] J. Villena-Román, J. García-Morera, M. Á. G. Cumbreras, E. Martínez-Cámara, M. T.
Martín-Valdivia, L. A. U. López, Overview of TASS 2015, in: J. Villena-Román, J. García-Morera,
M. Á. G. Cumbreras, E. Martínez-Cámara, M. T. Martín-Valdivia, L. A. U. López (Eds.),
Proceedings of TASS 2015: Workshop on Sentiment Analysis at SEPLN co-located with
31st SEPLN Conference (SEPLN 2015), Alicante, Spain, September 15, 2015, volume 1397
of CEUR Workshop Proceedings, CEUR-WS.org, 2015, pp. 13–21. URL: http://ceur-ws.org/
Vol-1397/overview.pdf.
[23] B. Pang, L. Lee, A sentimental education: Sentiment analysis using subjectivity
summarization based on minimum cuts, in: Proceedings of the 42nd Annual Meeting of the
Association for Computational Linguistics (ACL-04), Barcelona, Spain, 2004, pp. 271–278.</p>
      <p>URL: https://aclanthology.org/P04-1035. doi:10.3115/1218955.1218990.
[24] F. Sha, F. C. N. Pereira, Shallow parsing with conditional random fields, in: M. A. Hearst,
M. Ostendorf (Eds.), Human Language Technology Conference of the North American
Chapter of the Association for Computational Linguistics, HLT-NAACL 2003, Edmonton,
Canada, May 27 - June 1, 2003, The Association for Computational Linguistics, 2003, pp.
213–220. URL: https://aclanthology.org/N03-1028/.
[25] F. Ruggeri, F. Antici, A. Galassi, K. Korre, A. Muti, A. Barrón-Cedeño, On the definition
of prescriptive annotation guidelines for language-agnostic subjectivity detection, in:
R. Campos, A. M. Jorge, A. Jatowt, S. Bhatia, M. Litvak (Eds.), Text2Story@ECIR, volume
3370 of CEUR Workshop Proceedings, CEUR-WS.org, 2023, pp. 103–111. URL: https://ceur-ws.
org/Vol-3370/paper10.pdf.
[26] F. Benamara, B. Chardon, Y. Mathieu, V. Popescu, Towards context-based subjectivity
analysis, in: Proceedings of 5th International Joint Conference on Natural Language
Processing, Asian Federation of Natural Language Processing, Chiang Mai, Thailand, 2011,
pp. 1180–1188. URL: https://aclanthology.org/I11-1132.
[27] N. Kalchbrenner, E. Grefenstette, P. Blunsom, A convolutional neural network for
modelling sentences, in: Proceedings of the 52nd Annual Meeting of the Association for
Computational Linguistics, ACL 2014, June 22-27, 2014, Baltimore, MD, USA, Volume
1: Long Papers, The Association for Computer Linguistics, 2014, pp. 655–665. URL:
https://doi.org/10.3115/v1/p14-1062. doi:10.3115/v1/p14-1062.
[28] I. Chaturvedi, Y. Ong, I. Tsang, R. Welsch, E. Cambria, Learning word dependencies in
text by means of a deep recurrent belief network, Knowledge-Based Systems 108 (2016).
doi:10.1016/j.knosys.2016.07.019.
[29] F. Antici, L. Bolognini, M. A. Inajetovic, B. Ivasiuk, A. Galassi, F. Ruggeri, SubjectivITA: An
Italian corpus for subjectivity detection in newspapers, in: CLEF, volume 12880 of Lecture
Notes in Computer Science, Springer, 2021, pp. 40–52.
[30] C. Banea, R. Mihalcea, J. Wiebe, Multilingual subjectivity: Are more languages better?, in:
Proceedings of the 23rd International Conference on Computational Linguistics (Coling
2010), Coling 2010 Organizing Committee, Beijing, China, 2010, pp. 28–36. URL: https://aclanthology.org/C10-1004.</p>
    </sec>
    <sec id="sec-7">
      <title>A. Approaches Summary per Participant</title>
      <p>
        Accenture [
        <xref ref-type="bibr" rid="ref14">35</xref>
        ] employed several language-specific pre-trained language models, which
were fine-tuned on the downstream task. They also applied data augmentation through
back-translation to mitigate the class imbalance in the training datasets.
      </p>
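Back-translation augmentation of this kind can be sketched as follows. The `translate` helper is a hypothetical stand-in for any machine-translation model (not part of the Accenture system); the balancing loop adds round-trip paraphrases of minority-class sentences:

```python
import random

def translate(text, src, tgt):
    # Hypothetical stand-in for a machine-translation model. Tagging the text
    # keeps the example runnable; a real system would call an MT model here.
    return f"[{src}->{tgt}] {text}"

def back_translate(text, pivot="de"):
    # Round-trip the sentence through a pivot language to obtain a paraphrase.
    intermediate = translate(text, "en", pivot)
    return translate(intermediate, pivot, "en")

def augment_minority(samples, labels, minority_label, target_ratio=1.0):
    # Append back-translated copies of minority-class sentences until the
    # minority class reaches target_ratio times the majority-class size.
    majority = sum(1 for y in labels if y != minority_label)
    minority = [s for s, y in zip(samples, labels) if y == minority_label]
    out_s, out_y = list(samples), list(labels)
    while out_y.count(minority_label) < target_ratio * majority:
        out_s.append(back_translate(random.choice(minority)))
        out_y.append(minority_label)
    return out_s, out_y
```

With `target_ratio=1.0` the loop stops once the classes are balanced.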
      <p>
        Awakened implemented an ensemble of classifiers, including a BiLSTM and BART. The
embeddings produced by each model are concatenated and fed to a final classification layer.
DWReCo [
        <xref ref-type="bibr" rid="ref15">36</xref>
        ] experimented with propaganda style-based data augmentation via GPT-3
to address class imbalance. The styles are derived from a journalistic checklist for identifying
subjective news.
      </p>
      <p>
        ES-VRAI [
        <xref ref-type="bibr" rid="ref16">37</xref>
        ] used M-BERT and made use of oversampling techniques to address class
imbalance.
      </p>
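Random oversampling, one such technique, can be sketched as below; the function names are illustrative rather than taken from the ES-VRAI system:

```python
import random

def oversample(samples, labels, seed=0):
    # Randomly duplicate minority-class examples until every class matches
    # the size of the largest one (plain random oversampling).
    rng = random.Random(seed)
    by_label = {}
    for s, y in zip(samples, labels):
        by_label.setdefault(y, []).append(s)
    target = max(len(v) for v in by_label.values())
    out = []
    for y, items in by_label.items():
        padded = items + [rng.choice(items) for _ in range(target - len(items))]
        out.extend((s, y) for s in padded)
    rng.shuffle(out)  # avoid grouping all copies of one class together
    return out
```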
      <p>
        Fraunhofer SIT [
        <xref ref-type="bibr" rid="ref17">38</xref>
        ] employed GPT-3 for few-shot classification.
      </p>
      <p>
        Gpachov [
        <xref ref-type="bibr" rid="ref18">39</xref>
        ] applied an ensemble of three distinct models: XLM-RoBERTa, Sentence BERT
(S-BERT), and SetFit.
      </p>
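One common way to combine such heterogeneous models is soft voting over their predicted class probabilities; the sketch below is illustrative and does not reproduce Gpachov's exact combination strategy:

```python
def ensemble_predict(prob_lists, labels=("OBJ", "SUBJ")):
    # Average the class-probability vectors produced by several independent
    # models and return the argmax label (a simple soft-voting ensemble).
    n_models = len(prob_lists)
    avg = [sum(p[i] for p in prob_lists) / n_models for i in range(len(labels))]
    return labels[avg.index(max(avg))]
```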
      <p>KUCST used a Gradient Boosting classifier with BERT-based encoding and a subset of carefully
selected features as inputs.</p>
      <p>
        NN [
        <xref ref-type="bibr" rid="ref19">40</xref>
        ] used the multilingual XLM-RoBERTa model, pre-trained on 2.5TB of
filtered CommonCrawl data covering 100 languages, and fine-tuned it for the task.
tarrekko did not provide additional information.
      </p>
      <p>
        Thesis Titan [
        <xref ref-type="bibr" rid="ref20">41</xref>
        ] developed multiple fine-tuned models using mDeBERTaV3-base [
        <xref ref-type="bibr" rid="ref25">46</xref>
        ]
starting from a newly developed multilingual dataset. While keeping the training data fixed,
they used language-specific validation sets to optimize the models for each language and to
identify optimal language-specific hyperparameters.
      </p>
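Per-language model selection of this kind reduces to picking, for each language, the hyperparameter setting with the best validation score. A minimal sketch, with illustrative names rather than Thesis Titan's code:

```python
def select_per_language(results):
    # results maps (language, hyperparams) -> validation score.
    # For each language, keep the setting with the highest score.
    best = {}
    for (lang, hp), score in results.items():
        if lang not in best or score > best[lang][1]:
            best[lang] = (hp, score)
    return {lang: hp for lang, (hp, _) in best.items()}
```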
      <p>
        TOBB ETU [
        <xref ref-type="bibr" rid="ref21">42</xref>
        ] employed ChatGPT to classify the texts, exploring both zero-shot and
few-shot classification. In the latter, they included as in-context examples a few training-set
instances that the zero-shot method had misclassified.
      </p>
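A few-shot classification prompt of this kind can be assembled as below; the wording and label set are illustrative assumptions, not TOBB ETU's actual prompt:

```python
def build_prompt(sentence, examples):
    # Assemble a few-shot prompt from (sentence, gold_label) pairs — e.g.
    # training instances the zero-shot method misclassified.
    header = "Classify the sentence as SUBJ (subjective) or OBJ (objective).\n\n"
    shots = "".join(f"Sentence: {s}\nLabel: {y}\n\n" for s, y in examples)
    return header + shots + f"Sentence: {sentence}\nLabel:"
```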
      <p>
        TUDublin [
        <xref ref-type="bibr" rid="ref22">43</xref>
        ] experimented with M-BERT and made use of ChatGPT to perform data
augmentation of the available training datasets.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>A.</given-names>
            <surname>Barrón-Cedeño</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Alam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Caselli</surname>
          </string-name>
          , G. Da San Martino, T. Elsayed,
          <string-name>
            <given-names>A.</given-names>
            <surname>Galassi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Haouari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Ruggeri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Struß</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. N.</given-names>
            <surname>Nandi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. S.</given-names>
            <surname>Cheema</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Azizov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Nakov</surname>
          </string-name>
          ,
          <article-title>The CLEF-2023 CheckThat! Lab: Checkworthiness, subjectivity, political bias, factuality, and authority</article-title>
          , in:
          <string-name>
            <given-names>J.</given-names>
            <surname>Kamps</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Goeuriot</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Crestani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Maistro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Joho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Davis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Gurrin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>U.</given-names>
            <surname>Kruschwitz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Caputo</surname>
          </string-name>
          (Eds.),
          <source>Advances in Information Retrieval</source>
          , Springer Nature Switzerland, Cham,
          <year>2023</year>
          , pp.
          <fpage>506</fpage>
          -
          <lpage>517</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A.</given-names>
            <surname>Barrón-Cedeño</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Alam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Galassi</surname>
          </string-name>
          , G. Da San Martino, P. Nakov,
          <string-name>
            <given-names>T.</given-names>
            <surname>Elsayed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Azizov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Caselli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Cheema</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Haouari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hasanain</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kutlu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Ruggeri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Struß</surname>
          </string-name>
          , W. Zaghouani,
          <article-title>Overview of the CLEF-2023 CheckThat! Lab: Checkworthiness, subjectivity, political bias, factuality, and authority of news articles and their source</article-title>
          , in:
          <string-name>
            <given-names>A.</given-names>
            <surname>Arampatzis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Kanoulas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Tsikrika</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Vrochidis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Giachanou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Aliannejadi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Vlachos</surname>
          </string-name>
          , G. Faggioli, N. Ferro (Eds.),
          <source>Experimental IR Meets Multilinguality, Multimodality, and Interaction. Proceedings of the Fourteenth International Conference of the CLEF Association (CLEF</source>
          <year>2023</year>
          ),
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>F.</given-names>
            <surname>Alam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Barrón-Cedeño</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. S.</given-names>
            <surname>Cheema</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Hakimov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hasanain</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Míguez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Mubarak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. K.</given-names>
            <surname>Shahi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Zaghouani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Nakov</surname>
          </string-name>
          ,
          <article-title>Overview of the CLEF-2023 CheckThat! lab task 1 on check-worthiness in multimodal and multigenre content</article-title>
          ,
          <source>in: [47]</source>
          ,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>G.</given-names>
            <surname>Da San Martino</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Alam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hasanain</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. N.</given-names>
            <surname>Nandi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Azizov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Nakov</surname>
          </string-name>
          ,
          <article-title>Overview of the CLEF-2023 CheckThat! lab task 3 on political bias of news articles and news media</article-title>
          ,
          <source>in: [47]</source>
          ,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>P.</given-names>
            <surname>Nakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Alam</surname>
          </string-name>
          , G. Da San Martino, M. Hasanain,
          <string-name>
            <given-names>R. N.</given-names>
            <surname>Nandi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Azizov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Panayotov</surname>
          </string-name>
          ,
          <article-title>Overview of the CLEF-2023 CheckThat! lab task 4 on factuality of reporting of news media</article-title>
          ,
          <source>in: [47]</source>
          ,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>F.</given-names>
            <surname>Haouari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z. Sheikh</given-names>
            <surname>Ali</surname>
          </string-name>
          , T. Elsayed,
          <article-title>Overview of the CLEF-2023 CheckThat! lab task 5 on authority finding in twitter</article-title>
          , in: [47],
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>A.</given-names>
            <surname>Stepinski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V. O.</given-names>
            <surname>Mittal</surname>
          </string-name>
          ,
          <article-title>A fact/opinion classifier for news articles</article-title>
          , in:
          <string-name>
            <given-names>W.</given-names>
            <surname>Kraaij</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. P.</given-names>
            <surname>de Vries</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. L. A.</given-names>
            <surname>Clarke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Fuhr</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Kando</surname>
          </string-name>
          (Eds.),
          <source>SIGIR 2007: Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval</source>
          , Amsterdam, The Netherlands,
          July 23-27,
          <year>2007</year>
          , ACM, 2007
          , pp.
          <fpage>807</fpage>
          -
          <lpage>808</lpage>
          . URL: https://doi.org/10.1145/1277741.1277919. doi:10.1145/1277741.1277919.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>I.</given-names>
            <surname>Chaturvedi</surname>
          </string-name>
          , E. Cambria,
          <string-name>
            <given-names>R. E.</given-names>
            <surname>Welsch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Herrera</surname>
          </string-name>
          ,
          <article-title>Distinguishing between facts and opinions for sentiment analysis: Survey and challenges</article-title>
          ,
          <source>Inf. Fusion</source>
          <volume>44</volume>
          (
          <year>2018</year>
          )
          <fpage>65</fpage>
          -
          <lpage>77</lpage>
          . URL: https://doi.org/10.1016/j.inffus.2017.12.006. doi:10.1016/j.inffus.2017.12.006.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>S.</given-names>
            <surname>Clematide</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Gindl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Klenner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Petrakis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Remus</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Ruppenhofer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>U.</given-names>
            <surname>Waltinger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wiegand</surname>
          </string-name>
          ,
          <article-title>MLSA - a multi-layered reference corpus for German sentiment analysis</article-title>
          ,
          <source>in: Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)</source>
          ,
          <source>European Language Resources Association (ELRA)</source>
          , Istanbul, Turkey,
          <year>2012</year>
          , pp.
          <fpage>3551</fpage>
          -
          <lpage>3556</lpage>
          . URL: http://www.lrec-conf.org/proceedings/lrec2012/pdf/125_Paper.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>F.</given-names>
            <surname>Antici</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Galassi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Ruggeri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Korre</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Muti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bardi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Fedotova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Barrón-Cedeño</surname>
          </string-name>
          ,
          <article-title>A corpus for sentence-level subjectivity detection on English news articles</article-title>
          ,
          <year>2023</year>
          . arXiv:2305.18034.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>C.-L.</given-names>
            <surname>Yeh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Loni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hendriks</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Reinhardt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Schuth</surname>
          </string-name>
          ,
          <article-title>DpgMedia2019: A Dutch news dataset for partisanship detection</article-title>
          ,
          <year>2019</year>
          . arXiv:1908.02322.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [33]
          <string-name>
            <given-names>G. K.</given-names>
            <surname>Shahi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Struß</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mandl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Köhler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wiegand</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Siegel</surname>
          </string-name>
          ,
          <article-title>CT-FAN: A multilingual dataset for fake news detection</article-title>
          ,
          <year>2022</year>
          . URL: https://doi.org/10.5281/zenodo.6555293. doi:10.5281/zenodo.6555293.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [34]
          <string-name>
            <given-names>J.</given-names>
            <surname>Köhler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. K.</given-names>
            <surname>Shahi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Struß</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wiegand</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Siegel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mandl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Schütz</surname>
          </string-name>
          ,
          <article-title>Overview of the CLEF-2022 CheckThat! lab task 3 on fake news detection</article-title>
          , in:
          <source>Working Notes of CLEF 2022 - Conference and Labs of the Evaluation Forum</source>
          , CLEF '
          <year>2022</year>
          , Bologna, Italy,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [35]
          <string-name>
            <given-names>S.</given-names>
            <surname>Tran</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rodrigues</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Strauss</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Williams</surname>
          </string-name>
          , Accenture at CheckThat! 2023:
          <article-title>Impacts of back-translation on subjectivity detection</article-title>
          ,
          <source>in: [47]</source>
          ,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [36]
          <string-name>
            <given-names>I. B.</given-names>
            <surname>Schlicht</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Khellaf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Altiok</surname>
          </string-name>
          , DWReCo at CheckThat! 2023:
          <article-title>Enhancing subjectivity detection through style-based data sampling</article-title>
          ,
          <source>in: [47]</source>
          ,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [37]
          <string-name>
            <given-names>H. T.</given-names>
            <surname>Sadouk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Sebbak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. E.</given-names>
            <surname>Zekiri</surname>
          </string-name>
          , ES-VRAI at CheckThat! 2023:
          <article-title>Enhancing model performance for subjectivity detection through multilingual data aggregation</article-title>
          ,
          <source>in: [47]</source>
          ,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [38]
          <string-name>
            <given-names>R. A.</given-names>
            <surname>Frick</surname>
          </string-name>
          , Fraunhofer SIT at CheckThat! 2023:
          <article-title>Can LLMs be used for data augmentation &amp; few-shot classification? Detecting subjectivity in text using ChatGPT</article-title>
          ,
          <source>in: [47]</source>
          ,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [39]
          <string-name>
            <given-names>G.</given-names>
            <surname>Pachov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Dimitrov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Koychev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Nakov</surname>
          </string-name>
          , Gpachov at CheckThat! 2023:
          <article-title>A diverse multi-approach ensemble for subjectivity detection in news articles</article-title>
          ,
          <source>in: [47]</source>
          ,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [40]
          <string-name>
            <given-names>K.</given-names>
            <surname>Dey</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Tarannum</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Hasan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. R. H.</given-names>
            <surname>Noori</surname>
          </string-name>
          , NN at CheckThat! 2023:
          <article-title>Subjectivity in news articles classification with transformer based models</article-title>
          ,
          <source>in: [47]</source>
          ,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [41]
          <string-name>
            <given-names>F.</given-names>
            <surname>Leistra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Caselli</surname>
          </string-name>
          , Thesis Titan at CheckThat! 2023:
          <article-title>Language-specific fine-tuning of mDeBERTaV3 for subjectivity detection</article-title>
          ,
          <source>in: [47]</source>
          ,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [42]
          <string-name>
            <given-names>M. Deniz</given-names>
            <surname>Türkmen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Coşgun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kutlu</surname>
          </string-name>
          , TOBB ETU at CheckThat! 2023:
          <article-title>Utilizing ChatGPT to detect subjective statements and political bias</article-title>
          ,
          <source>in: [47]</source>
          ,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [43]
          <string-name>
            <given-names>E.</given-names>
            <surname>Shushkevich</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Cardiff</surname>
          </string-name>
          , TUDublin at CheckThat! 2023:
          <article-title>ChatGPT for data augmentation</article-title>
          ,
          <source>in: [47]</source>
          ,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [44]
          <string-name>
            <given-names>D.</given-names>
            <surname>Chicco</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Jurman</surname>
          </string-name>
          ,
          <article-title>The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation</article-title>
          ,
          <source>BMC Genomics 21</source>
          (
          <year>2020</year>
          )
          <fpage>1</fpage>
          -
          <lpage>13</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [45]
          <string-name>
            <given-names>N.</given-names>
            <surname>Reimers</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Gurevych</surname>
          </string-name>
          ,
          <article-title>Sentence-BERT: Sentence embeddings using Siamese BERT-networks</article-title>
          ,
          <source>in: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)</source>
          ,
          <publisher-name>Association for Computational Linguistics</publisher-name>
          , Hong Kong, China,
          <year>2019</year>
          , pp.
          <fpage>3982</fpage>
          -
          <lpage>3992</lpage>
          . URL: https://aclanthology.org/D19-1410. doi:10.18653/v1/D19-1410.
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [46]
          <string-name>
            <given-names>P.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <article-title>DeBERTaV3: Improving DeBERTa using ELECTRA-style pre-training with gradient-disentangled embedding sharing</article-title>
          ,
          <year>2021</year>
          . arXiv:2111.09543.
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [47]
          <string-name>
            <given-names>M.</given-names>
            <surname>Aliannejadi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Faggioli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ferro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Vlachos</surname>
          </string-name>
          (Eds.),
          <source>Working Notes of CLEF 2023 - Conference and Labs of the Evaluation Forum</source>
          , CLEF
          <year>2023</year>
          , Thessaloniki, Greece.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>