<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>CEUR Workshop Proceedings</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>DS@GT at CheckThat! 2025: Detecting Subjectivity via Transfer-Learning and Corrective Data Augmentation</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Maximilian Heil</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Dionne Bang</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Georgia Institute of Technology</institution>
          ,
          <addr-line>North Ave NW, Atlanta, GA 30332</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <abstract>
        <p>This paper presents our submission to Task 1, Subjectivity Detection, of the CheckThat! Lab at CLEF 2025. We investigate the effectiveness of transfer-learning and stylistic data augmentation to improve classification of subjective and objective sentences in English news text. Our approach contrasts fine-tuning of pre-trained encoders with transfer-learning from transformers already fine-tuned on related tasks. We also introduce a controlled augmentation pipeline using GPT-4o to generate paraphrases in predefined subjectivity styles. To ensure label and style consistency, we employ the same model to correct and refine the generated samples. Results show that transfer-learning with specialized encoders outperforms fine-tuning general-purpose ones, and that carefully curated augmentation significantly enhances model robustness, especially in detecting subjective content. Our official submission placed us 16th of 24 participants. Overall, our findings underscore the value of combining encoder specialization with label-consistent augmentation for improved subjectivity detection. Our code is available at https://github.com/dsgt-arc/checkthat-2025-subject.</p>
      </abstract>
      <kwd-group>
        <kwd>Subjectivity Detection</kwd>
        <kwd>Transfer Learning</kwd>
        <kwd>Transformer</kwd>
        <kwd>Data Generation</kwd>
        <kwd>GPT</kwd>
        <kwd>Fine Tuning</kwd>
        <kwd>CEUR-WS</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Given the great risk of misinformation globally[
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], the need for automatic fact-checking systems is vital.
Like any machine learning pipeline, an automatic fact-checking system comprises more than just a
classifier: retrieval to equip the system with evidence for the fact-check, data preparation to comply with
the format requirements of the system, training or fine-tuning to enhance classification performance,
and much more. For example, while an objective sentence can be fact-checked directly, a subjective sentence
needs further processing before it can be passed into a fact-checking system: it
must be stripped of emotions, opinions, and personal interpretations so that
the fact-checking system can subsequently focus on factual verification alone. This motivates
the CheckThat! Lab of CLEF 2025[
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], where Task 1 focuses on identifying subjective and objective
sentences in newspaper articles[
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Task 1 of CheckThat! has evolved over recent years to address
subjectivity detection in multilingual and monolingual contexts. Previous editions in 2023 and 2024
have established strong baselines using transformer-based models and explored both traditional and
generative approaches [
        <xref ref-type="bibr" rid="ref4 ref5">4, 5</xref>
        ]. Participating teams have applied lexicon-based classifiers, fine-tuned
encoders, and increasingly, synthetic data generation to boost performance under limited data settings.
In this paper, we present our contribution to the 2025 English monolingual task. Our approach
explores three key research areas: transfer-learning, data augmentation, and the ability of a
generative model to correct and refine itself (self-correction). We evaluate general pre-trained
encoders and compare them with encoders that have already been fine-tuned on related tasks (specialized
encoders). In addition, we investigate the role of data augmentation through stylistic paraphrasing via
a large language model (LLM). Furthermore, we introduce a correction pipeline using the same LLM to
align generated paraphrases with their intended labels and stylistic attributes. The impact of each
component is assessed through detailed ablation experiments. Overall, our submission
ranked 16th out of 24 in the competition.
      </p>
      <p>The paper is structured as follows: Section 2 highlights related work, Section 3 presents our
methodology, Section 4 describes the dataset, Sections 5 and 6 present and discuss the results,
Section 7 highlights future research avenues, and Section 8 concludes.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        Subjectivity detection has a long history across contexts[
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], domains[
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], and languages[
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. More
specifically, subjectivity detection in newspaper articles has a three-year history with CheckThat! at
CLEF. Most of the results [
        <xref ref-type="bibr" rid="ref4 ref5">4, 5</xref>
        ] have been driven by the advent of transformer architectures [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] and the
introduction of BERT-like encoder models[
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] for natural language processing (NLP). NLP systems now
easily expand across domains or languages with high accuracy and robustness. In addition, generative
models have made substantial contributions due to their outstanding zero-shot and few-shot capability,
as well as their significantly longer context windows[
        <xref ref-type="bibr" rid="ref11">11</xref>
        ].
      </p>
      <p>
        Last year’s winner of the monolingual English task, Team DWReCo, employed an LLM to generate
subjective training examples with subjectivity styles (e.g. partisan, exaggerated, emotional) given
expert knowledge. Synthetic data has been used to balance the data set and enrich the fine-tuning
of a RoBERTa-base[
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] model with a classification head.[
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] Our data augmentation approach is
influenced by Team DWReCo, but we are less interested in data sampling. In contrast, we use stylistic
data augmentation for contrastive learning[
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. The 2023 monolingual English task winner, Team
HYBRINFOX, built an ensemble of a fine-tuned RoBERTa-base and a DistilBERT-base-nli-mean-tokens
(sentence transformer[
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]) to capture the syntactic as well as semantic meaning of the sentences. This is
complemented by a self-designed expert system that classifies using NER and lexicon-based methods[
        <xref ref-type="bibr" rid="ref16">16</xref>
        ].
Similarly, we also fine-tune general-purpose encoders on the data set and, in addition, investigate the
capability of transfer learning in subjectivity detection.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Methodology</title>
      <p>In the course of the competition, we explored the potential of specialized encoders and data augmentation.
As shown in Table 1, we contrasted the fine-tuning of general-purpose encoders with transfer-learning
of specialized encoders, fine-tuning both on the original data set. In a second step, we explored data
augmentation and investigated its added benefit. Finally, we also added a self-corrective data alignment
procedure to the data augmentation to ensure that generated paraphrases match their intended labels
and styles, using GPT-4o to identify and rewrite those it considers inconsistent.</p>
      <p>
        First, our approach contrasted general-purpose encoders and specialized encoders by evaluating
their respective capabilities to distinguish objective from subjective newspaper sentences. Initially,
we assessed the performance of general-purpose encoders including RoBERTa-base, MiniLM-L12-v2,
and ModernBERT [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. This evaluation focused on comparing their capabilities in capturing token-level and
sentence-level semantic relationships, as well as contextual understanding. We then employed the
transfer-learning capabilities of encoders Sentiment-Analysis-BERT [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ],
Emotion-English-DistilRoBERTa-base, and Emotion-English-RoBERTa-large [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ], which were already fine-tuned on related tasks with
domain-specific datasets for sentiment analysis and emotion recognition. These models are better
equipped to detect emotional tone and subjective language, improving their ability to distinguish
between subjective and objective statements.
      </p>
      <p>Hypothesis H1: Transfer-learning with specialized encoders will result in greater sensitivity
in distinguishing between subjective and objective language expressions.</p>
      <p>
        Second, we investigated the added benefit of data augmentation. Given the small size of the original
dataset, we pursued a strategy to synthetically augment and expand the training data using GPT-4o[
        <xref ref-type="bibr" rid="ref20">20</xref>
        ].
Inspired by ClaimDecomp[
        <xref ref-type="bibr" rid="ref21">21</xref>
        ], which decomposed complex political claims into literal and implied
sub-questions to improve fact verification, we hypothesized that generating stylistic paraphrases of
labeled sentences could improve classification performance by increasing training diversity.
      </p>
      <p>
        ClaimDecomp evaluated sub-question generation using encoder-based language models but found
that while literal questions were tractable, implied ones remained challenging due to the models’ limited
reasoning and contextual capabilities. They also noted that larger language models like GPT could be
more suitable for handling these implicit inferences. In parallel, the CLEF CheckThat! 2024 overview[
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]
found that transformer-based classifiers augmented with domain-informed synthetic data outperformed
baselines and that models consistently struggled more with identifying subjective sentences across
languages.
      </p>
      <p>
        Drawing on these findings, we adopted a generation approach similar in spirit to CLaC-2 [22], who
used zero-shot GPT-3 to generate two paraphrases per sentence and labeled them via majority vote.
However, rather than perform on-the-fly classification, we used GPT-4o in a few-shot setup to generate
paraphrases with controlled styles, both subjective and objective, based on the original sentence content.
Unlike DWReCo[
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], who used a zero-shot prompt method to generate new subjective examples based
on a subjectivity style checklist, we generated both subjective and objective sentences to augment
each data point. DWReCo also found that augmentation with paraphrased sentences produced
lower-diversity samples and required post-hoc filtering. We did not apply such filtering in our submitted
results due to time constraints.
      </p>
      <p>Our generation approach was guided by the hypothesis that providing both subjective and objective
perspectives per original content could improve a model’s ability to distinguish stylistic cues, in the spirit
of contrastive learning. For each subjective sentence, we generated two or six objective paraphrases.
Conversely, for each objective sentence, we generated two or six subjective variants, explicitly styled
as propaganda, exaggerated, emotional, derogatory, partisan, or prejudiced (categories aligned with
DWReCo’s most effective style prompts).</p>
      <p>• Original (SUBJ): “Gone are the days when they led the world in recession-busting.”
• Generated (OBJ): “The era in which they were at the forefront of overcoming economic
downturns has ended.”
• Original (OBJ): “The trend is expected to reverse as soon as next month.”
• Generated (SUBJ): “A promising turnaround is on the horizon, with expectations for change as
early as next month.”</p>
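The style-controlled generation step described above can be sketched as a simple prompt builder. This is an illustrative reconstruction, not the authors' exact prompt: the function name, wording, and paraphrase-count parameter are our assumptions, while the six style labels come from the paper.

```python
# Hypothetical sketch of the few-shot GPT-4o augmentation prompt assembly.
# Style categories are those named in the paper; phrasing is ours.
STYLES = ["propaganda", "exaggerated", "emotional",
          "derogatory", "partisan", "prejudiced"]

def build_augmentation_prompt(sentence, source_label, n_paraphrases=2):
    """Request paraphrases in the opposite subjectivity class (SUBJ <-> OBJ)."""
    target = "OBJ" if source_label == "SUBJ" else "SUBJ"
    style_hint = ("Remove all subjective language and opinion."
                  if target == "OBJ"
                  else "Write each in one of these styles: "
                       + ", ".join(STYLES) + ".")
    return (f"Rewrite the sentence below as {n_paraphrases} {target} "
            f"paraphrases. {style_hint}\nSentence: \"{sentence}\"")
```

Calling this with `n_paraphrases=2` or `6` mirrors the Balanced-2 and Balanced-6 settings.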
      <sec id="sec-3-1">
        <title>This resulted in two augmented datasets:</title>
        <p>• Balanced-2: Each sentence augmented with 2 paraphrases in the opposite style
• Balanced-6: Each sentence augmented with 6 paraphrases covering a wider stylistic range
To assess the impact of this augmentation, we selected two encoder models for fine-tuning:
Sentiment-Analysis-BERT and Emotion-English-DistilRoBERTa-base. These were chosen based on
their stronger baseline performance compared to general-purpose encoders, which underperformed
consistently in earlier trials and were excluded from further experiments.</p>
        <p>Hypothesis H2: Fine-tuning with augmented train data will improve the classification
performance over the original data set.</p>
        <p>After submission, in an attempt to further improve the quality of our augmented datasets, we developed
a second-stage validation and correction pipeline using GPT-4o. Our goal was to ensure that each
generated sentence not only aligned with its assigned label (SUBJ or OBJ), but also reflected the
appropriate stylistic intent. After visually inspecting the generated samples, we recognized that some
synthetic samples were misleading and could degrade performance when used to fine-tune the
classifier. Therefore, we implemented an automated revision program that iterates over each sentence
in the augmented training set. For each sample, GPT-4o was prompted with the original sentence, its
intended label, and associated style (e.g., partisan, emotional). The model was instructed to do nothing
if the sentence already matched the label and style. Otherwise, it rewrote the sentence to meet the
intended classification and style requirements, preserving the subject matter and keeping outputs under
25 words. This method ensured consistency in label-style alignment and improved stylistic clarity
without introducing content drift. The system was built in Python using LangChain’s ChatOpenAI[23]
wrapper and Polars[24] for input/output. The prompt template emphasized minimal intervention:
Correction Data Augmentation Prompt: You are an expert in rewriting
sentences to match specific subjectivity and style requirements.
Instructions:
- You will be given a sentence and its intended label ("SUBJ" or
"OBJ") and style.
- If the sentence already matches the label and style, return it
unchanged.
- If it does NOT match, rewrite the sentence so it reflects the
correct label and style.
- Always preserve the subject matter.
- Only apply style if the label is SUBJ. For OBJ, remove all
subjective language and opinion.
- Keep the rewritten sentence **under 25 words**.</p>
        <p>Now perform the task:
Label: {label}
Style: {style}
Sentence: "{sentence}"</p>
        <p>Response:
The model operated asynchronously with a concurrency cap to process thousands of samples efficiently,
with built-in error handling and whitespace normalization. While the correction step added complexity,
the pipeline remained efficient and reproducible. It was run locally using the OpenAI API through
LangChain, with GPT-4o rewriting only when it detected a mismatch between a sentence’s content and
its assigned label or style. The process completed in a few minutes in practice and outputs were saved
as .tsv files with no manual edits. The full pipeline also ran on Georgia Tech’s PACE cluster, ensuring
consistent results under controlled API and compute conditions.</p>
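A minimal sketch of such an asynchronous correction pass, assuming the concurrency-capped loop described above. Here `call_llm` is a placeholder for the LangChain `ChatOpenAI` call, and `PROMPT` abbreviates the correction template shown earlier; this is our reconstruction, not the authors' script.

```python
import asyncio

# Abbreviated from the correction prompt above; the full instruction
# block would precede these fields in practice.
PROMPT = 'Label: {label}\nStyle: {style}\nSentence: "{sentence}"'

async def correct_sample(sem, call_llm, sentence, label, style):
    # The semaphore caps the number of simultaneous API calls.
    async with sem:
        reply = await call_llm(PROMPT.format(label=label, style=style,
                                             sentence=sentence))
    return " ".join(reply.split())  # whitespace normalization

async def correct_dataset(samples, call_llm, max_concurrency=8):
    """samples: iterable of (sentence, label, style) triples."""
    sem = asyncio.Semaphore(max_concurrency)
    tasks = [correct_sample(sem, call_llm, s, l, st)
             for s, l, st in samples]
    return await asyncio.gather(*tasks)  # preserves input order
```

Swapping `call_llm` for a real `ChatOpenAI` invocation and writing the results out with Polars would give the shape of the pipeline described above.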
      </sec>
      <sec id="sec-3-2">
        <title>This process yielded two corrected datasets:</title>
        <p>• Corrected Balanced-2: Based on the original Balanced-2 augmentation, where each sentence
had two style-flipped paraphrases.
• Corrected Balanced-6: Based on Balanced-6, with six paraphrases per sentence spanning
multiple style prompts.</p>
        <p>Examples of corrections (initial generation, then corrected version with rationale):
• Initial (SUBJ, exaggerated): “The plight of Serbia’s LGBTQ+ community remains largely unaddressed, leaving them in a void of neglect.” Corrected: “Serbia’s LGBTQ+ community is shockingly ignored, casting them into an abyss of utter neglect!” (increased rhetorical intensity for the exaggerated category)
• Initial (SUBJ, propaganda): “A promising turnaround is on the horizon, with expectations for change as early as next month.” Corrected: “A glorious transformation awaits us, with change destined to arrive as soon as next month!” (intensified tone to better fit the propaganda category)
• Initial (SUBJ, propaganda): “He expressed that a new variant emerging this fall would not come as a shock to him.” Corrected: “The emergence of a new variant this fall is inevitable and will not surprise the vigilant.” (rewritten to sound more declarative and assertive to better fit the propaganda category)</p>
        <p>We then fine-tuned Sentiment-Analysis-BERT and Emotion-English-DistilRoBERTa-base on these
cleaned and enhanced datasets. As can be seen in Table 4, this refinement led to improved macro-F1
scores and more stable class-wise performance, particularly in subjective detection.</p>
        <p>Hypothesis H3: Self-corrected data augmentation increases the quality of synthetic data and
therefore the fine-tuned classifier performance.</p>
      </sec>
      <sec id="sec-3-3">
        <title>All results of these described methods can be found in Section 5.</title>
        <p>3.1. Evaluation
Our models were fine-tuned and evaluated on the macro-averaged F1 measure (macro-F1):</p>
        <p>Macro-F1 = (1/N) \sum_{i=1}^{N} F1_i    (1)</p>
        <p>where F1_i is the class-wise F1 score:</p>
        <p>F1_i = (2 × P_i × R_i) / (P_i + R_i)    (2)</p>
        <p>where R_i is the recall of class i and P_i is the precision of class i.</p>
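As a sanity check, the metric can be computed directly. Below is a minimal reference implementation of macro-F1 for the two-class OBJ/SUBJ setting; it is our illustrative sketch, not the competition's official scorer.

```python
def macro_f1(y_true, y_pred, labels=("OBJ", "SUBJ")):
    """Macro-averaged F1: the unweighted mean of per-class F1 scores."""
    scores = []
    for c in labels:
        # Per-class counts: true positives, false positives, false negatives.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        scores.append(f1)
    return sum(scores) / len(scores)
```

Because each class contributes equally regardless of its frequency, macro-F1 penalizes models that neglect the rarer subjective class, which is exactly the failure mode discussed in Section 5.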
        <p>Training was performed on a Tesla V100-16GB on the Phoenix cluster of Georgia Tech’s Partnership
for an Advanced Computing Environment[25] or locally on an Apple M3 Pro GPU-36GB and Metal
Performance Shaders.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Data</title>
      <p>We used the English dataset provided for Task 1 of CheckThat! 2025, which consisted of labeled
sentences from news articles. Each sentence was annotated as either objective (OBJ) or subjective
(SUBJ). The dataset was divided into training, development (dev), and test sets. The distribution
of class labels across these splits is shown in Figure 4.</p>
      <p>The training set contained substantially more objective sentences (523) than subjective sentences
(298), but this difference was not large enough to constitute a severe class imbalance. The dev set,
which we used as our validation set, was more balanced, with 222 objective and
240 subjective examples. Below are representative examples from each class:
• Subjective: “Gone are the days when they led the world in recession-busting.”
• Objective: “The trend is expected to reverse as soon as next month.”</p>
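The imbalance comparison can be made concrete with a quick calculation using the counts quoted in this section (523/298 train, 222/240 dev):

```python
from collections import Counter

# Label counts quoted in the text for the Task 1 English splits.
train = Counter({"OBJ": 523, "SUBJ": 298})
dev = Counter({"OBJ": 222, "SUBJ": 240})

def minority_share(counts):
    """Fraction of the split belonging to the rarer class (0.5 = balanced)."""
    return min(counts.values()) / sum(counts.values())
```

The train split's minority share is about 0.36 versus roughly 0.48 for dev, matching the observation that the dev set is considerably more balanced.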
      <p>The subjective sentence showed a personal interpretation or opinion, reflecting the speaker’s
sentiment toward a past event. In contrast, the objective sentence presented factual information or predictions
without personal bias. This distinction illustrates that subjective statements often involve emotional or
evaluative language, while objective statements rely on logical reasoning.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Results</title>
      <p>Table 3 highlights our main results for the general-purpose and domain-specific encoders that
we used. RoBERTa-base achieved 0.70 macro-F1 on train, 0.75 macro-F1 on validation, and 0.65 macro-F1
on test after fine-tuning for 3 epochs with a learning rate of 1e-4 on the original train data set. During
validation, the RoBERTa-base model demonstrated a balanced ability to identify objective and subjective
sentences. However, this dropped during test, where the subjective F1 (0.58) was significantly lower
than the objective F1 (0.72).</p>
      <p>MiniLM-L6-v2 showed weaker generalization across splits, achieving 0.81 macro-F1 on train, 0.69
on validation, and 0.64 on test after fine-tuning for 3 epochs with a learning rate of 1e-4. It had
difficulties recognizing objective sentences during validation (only 0.58 F1), though test
performance saw a drop in subjective F1 (0.51) compared to objective F1 (0.76), indicating a slight bias
toward objective language cues.</p>
      <p>ModernBERT-large struggled with subjective classification, achieving 0.41 macro-F1 on train and 0.41
on test, despite a high objective F1 on test (0.84). After being fine-tuned for 2 epochs with a learning rate of
2e-5, it completely failed to capture subjective expressions during validation and test, suggesting overfitting
and limited adaptability to subjective nuances in the training data.</p>
      <p>Sentiment-Analysis-BERT achieved 0.87 macro-F1 on train, 0.64 on validation, and 0.67 on test with 4 epochs and a learning rate of 2e-5. Although its validation
subjective F1 (0.53) was notably lower than its objective F1 (0.71), it generalized better on test with more balanced
F1 scores (0.58 and 0.77).</p>
      <p>Emotion-English-DistilRoBERTa-base achieved the best performance among all models with 0.90
macro-F1 on train and 0.77 macro-F1 on validation. It also showed strong and balanced validation scores
for both objective (0.78) and subjective (0.75) classification. After being fine-tuned for 6 epochs with a
learning rate of 2e-4, it sustained decent test performance (0.68 macro-F1), although subjective F1 dropped to 0.57. This
model was used for our submission and placed us 16th out of 24.</p>
      <p>Emotion-English-RoBERTa-large showed similar performance to the distilled model above: 0.67
macro-F1 on test after 7 epochs with a learning rate of 2e-5. It had similar validation results (0.77 macro-F1) and very high
train performance (0.97 macro-F1). Turning to the augmented datasets, all configurations achieved high train
macro-F1 scores (above 0.90), indicating strong fit to the training data.</p>
      <p>For Emotion-English-DistilRoBERTa-base, the highest validation macro-F1 was achieved with the
Corrected Balanced-2 dataset (0.74), where objective and subjective F1 scores were 0.71 and 0.67,
respectively. Corrected Balanced-6 performed slightly worse, but remained strong (0.71 macro-F1). The
Balanced-2 dataset without editing resulted in a lower macro-F1 (0.68), while Balanced-6 showed the
weakest performance overall (0.64), with an especially low objective F1 (0.57).</p>
      <p>Sentiment-Analysis-BERT showed greater variability. Corrected Balanced-6 yielded the best overall
macro-F1 (0.74) with balanced class performance. Corrected Balanced-2 produced a slightly higher
macro-F1 (0.75), but this was driven by a strong objective F1 (0.68) and a low subjective F1 (0.36),
indicating an imbalance in class predictions. The Balanced-2 dataset without editing led to more
balanced class scores (0.64 and 0.65) and a moderate macro-F1 of 0.672. Balanced-6 performed the worst
(0.63 macro-F1), with notably lower objective performance (0.54).</p>
      <p>In general, editing improved consistency in subjective classification.
Emotion-English-DistilRoBERTa-base performed reliably across both dataset sizes when editing was applied. In contrast,
Sentiment-Analysis-BERT was more sensitive to augmentation type and prone to overfitting, particularly favoring
objective predictions when trained on the Corrected Balanced-2 dataset.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Discussion</title>
      <p>Our results highlight the benefits and limitations of transfer-learning with specialized models and
data augmentation for subjectivity detection in news text. Among models trained only on the original
data, encoders already fine-tuned on related tasks, such as Sentiment-Analysis-BERT and
Emotion-English-DistilRoBERTa-base, outperformed general pre-trained models in macro-F1 on the validation and
test sets. This suggests that pretraining on sentiment- and emotion-related corpora improves sensitivity
to subjective linguistic cues and confirms Hypothesis H1.</p>
      <p>Gains from augmentation were not uniform. While adding more paraphrases (Balanced-6) increased
training scores, it did not always translate to better validation performance. In some cases, especially
without correction, augmentation degraded performance due to inconsistencies between the generated
sentence and its intended label. Clearly, model performance suffered from overfitting to noise, which
confirms past findings that LLM-generated data can introduce noise without sufficient validation.
Therefore, Hypothesis H2 must be rejected, as fine-tuning with augmented training data did not
improve performance.</p>
      <p>Applying a correction pipeline improved validation macro-F1 for both models.
Emotion-English-DistilRoBERTa-base showed consistent gains across datasets, with Corrected Balanced-2 and Corrected
Balanced-6 outperforming their uncorrected counterparts. Sentiment-Analysis-BERT exhibited more
volatile behavior. Its macro-F1 improved in Corrected Balanced-6 with balanced class performance, but
Corrected Balanced-2 showed a misleading macro-F1 gain driven by high objective performance, while
subjective performance dropped substantially. Among all configurations, Corrected Balanced-6 yielded
the most stable performance across classes for both models. This suggests that correcting label and
style alignment in synthetic data improves generalization and robustness, confirming Hypothesis H3.</p>
      <p>Importantly, these improvements were not just a result of adding more data, but of the generative
model self-refining the augmented data to ensure semantic and stylistic alignment. The second-stage
editing process using GPT-4o, which rewrote misaligned samples while preserving subject matter and
label intent, played a key role in reducing label noise and improving model reliability. This distinction
between simple augmentation and corrected augmentation was critical to achieving consistent gains.</p>
      <p>Although the models fine-tuned on Corrected Balanced-2 and Corrected Balanced-6 performed best
overall, we were not able to submit them to the shared task because those results were finalized after the
submission deadline. We submitted results from Emotion-English-DistilRoBERTa-base fine-tuned on the
original train dataset, which resulted in a 0.68 test macro-F1 and placed us at rank 16 out of 24 in the
monolingual English competition. Our submission is thus significantly better than the organizers’
baseline (test macro-F1: 0.54) but leaves room for improvement, as the top competitor achieved
a 0.81 macro-F1 on the test set. Overall, these findings show that combining transfer-learning
with carefully curated synthetic examples can improve performance in low-resource tasks
like subjectivity detection, as long as the synthetic data is reliable and label-consistent.</p>
    </sec>
    <sec id="sec-7">
      <title>7. Future Work</title>
      <p>The research can be extended by applying our approach to multilingual settings, leveraging languages
such as Arabic, Bulgarian, German, Italian, Spanish, and French included in Task 1. Our approach
can further be improved by incorporating more data, e.g. the dev dataset, into the model fine-tuning
for submission. More labeled data can improve the quality of the fine-tuned classifier. Also, the
contrastive learning approach can be enhanced by incorporating more specialized loss-functions for the
model. Furthermore, we have observed models with varying strengths. For example,
Emotion-English-RoBERTa-large has demonstrated a greater capability of correctly identifying objective sentences, while
Emotion-English-DistilRoBERTa-base has superior performance on subjective sentences. An ensembling
approach combining these models could further enhance classification performance and is an open research
avenue. Our contribution to subjectivity detection may be integrated into the broader fact-checking
framework via misinformation analysis, bias assessment, and claim verification workflows.</p>
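The ensembling idea could be prototyped as simple probability averaging across the two emotion encoders. This is a hypothetical sketch of the open research avenue, not an implemented system; the probabilities are placeholders, not model outputs.

```python
def ensemble_predict(probs_a, probs_b, labels=("OBJ", "SUBJ")):
    """Average two classifiers' class probabilities and pick the argmax.

    probs_a / probs_b: per-class probabilities from two models,
    ordered to match `labels`.
    """
    avg = [(a + b) / 2 for a, b in zip(probs_a, probs_b)]
    return labels[avg.index(max(avg))]
```

A weighted average (favoring each model on the class it handles best) would be a natural refinement of this soft-voting baseline.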
    </sec>
    <sec id="sec-8">
      <title>8. Conclusions</title>
      <p>In this paper, we explored subjectivity detection in news text through the lens of transfer-learning
and data augmentation. Our findings highlight three key insights: First, domain-specific encoder
models already fine-tuned on sentiment or emotion datasets consistently outperformed general
pretrained encoders, supporting our hypothesis that transfer learning can enhance sensitivity to subjective
language. Second, while naive data augmentation introduces inconsistencies that occasionally degraded
performance, a "self-corrective" pipeline using GPT-4o significantly improved the quality and label
alignment of synthetic examples. This allowed models to generalize better and increased macro-F1
scores, particularly for the harder-to-detect subjective class. Third, although our final submission was
constrained to uncorrected data due to time limits, our post-submission results demonstrate the value
of integrating high-quality synthetic training data. Overall, our approach underscores the importance
of not only augmenting data but also of having a generative model check its own output for semantic
and stylistic fidelity. Combining specialized encoders with refined augmentation holds promise for
improving low-resource NLP tasks like subjectivity detection.</p>
    </sec>
    <sec id="sec-9">
      <title>Acknowledgements</title>
      <p>We thank the DS@GT CLEF team for providing valuable comments and suggestions. This research
was supported in part through research cyberinfrastructure resources and services provided by the
Partnership for an Advanced Computing Environment (PACE) at the Georgia Institute of Technology,
Atlanta, Georgia, USA.</p>
    </sec>
    <sec id="sec-10">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this work, the authors used OpenAI GPT-4o for grammar and
spelling checking. After using this tool, the authors reviewed and edited the content as needed and take
full responsibility for the publication’s content.</p>
      <p>[22] A. Awadallah, A. Trabelsi, CLaC at CheckThat! 2024: Sentence-level subjectivity classification using
zero-shot GPT-3 and majority voting, in: CLEF 2024 Working Notes, 2024.
[23] LangChain AI and contributors, LangChain: A Python package for natural language processing, https://github.com/langchain-ai/langchain, 2025.
[24] Polars BV and contributors, Polars: A fast DataFrame library implemented in Rust, https://github.com/pola-rs/polars, 2025.
[25] PACE, Partnership for an Advanced Computing Environment (PACE), 2017. URL: http://www.pace.gatech.edu.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>M.</given-names>
            <surname>Elsner</surname>
          </string-name>
          , G. Atkinson,
          <string-name>
            <given-names>S.</given-names>
            <surname>Zahidi</surname>
          </string-name>
          ,
          <source>Global Risks Report</source>
          <year>2025</year>
          ,
          Technical Report
          , World Economic Forum, Geneva,
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>F.</given-names>
            <surname>Alam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Struß</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Chakraborty</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Dietze</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Hafid</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Korre</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Muti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Nakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Ruggeri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Schellhammer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Setty</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Sundriyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Todorov</surname>
          </string-name>
          ,
          <string-name>
            <surname>V. V.</surname>
          </string-name>
          ,
          <article-title>The clef-2025 checkthat! lab: Subjectivity, fact-checking, claim normalization, and retrieval</article-title>
          , in:
          <string-name>
            <given-names>C.</given-names>
            <surname>Hauff</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Macdonald</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Jannach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Kazai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. M.</given-names>
            <surname>Nardini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Pinelli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Silvestri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Tonellotto</surname>
          </string-name>
          (Eds.),
          <source>Advances in Information Retrieval</source>
          , Springer Nature Switzerland, Cham,
          <year>2025</year>
          , pp.
          <fpage>467</fpage>
          -
          <lpage>478</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>F.</given-names>
            <surname>Ruggeri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Muti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Korre</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Struß</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Siegel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wiegand</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Alam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Biswas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Zaghouani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Nawrocka</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Ivasiuk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Razvan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mihail</surname>
          </string-name>
          ,
          <article-title>Overview of the CLEF-2025 CheckThat! lab task 1 on subjectivity in news article</article-title>
          , in: G. Faggioli,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ferro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rosso</surname>
          </string-name>
          , D. Spina (Eds.), Working Notes of CLEF 2025 -
          <article-title>Conference and Labs of the Evaluation Forum</article-title>
          ,
          CLEF
          <year>2025</year>
          , Madrid, Spain,
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>A.</given-names>
            <surname>Galassi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Ruggeri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Barrón-Cedeño</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Alam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Caselli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kutlu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Struß</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Antici</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hasanain</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Köhler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Korre</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Leistra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Muti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Siegel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. D.</given-names>
            <surname>Türkmen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wiegand</surname>
          </string-name>
          , W. Zaghouani,
          <article-title>Notebook for the CheckThat! Lab at CLEF 2023</article-title>
          ,
          <source>in: Proceedings of the Working Notes of CLEF 2023 - Conference and Labs of the Evaluation Forum (CLEF)</source>
          ,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Struß</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Ruggeri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Dimitrov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Galassi</surname>
          </string-name>
          , G. Pachov,
          <string-name>
            <surname>I. Koychev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Nakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Siegel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wiegand</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hasanain</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Suwaileh</surname>
          </string-name>
          , W. Zaghouani,
          <article-title>Notebook for the CheckThat! Lab at CLEF 2024</article-title>
          ,
          <source>in: Proceedings of the Working Notes of CLEF 2024 - Conference and Labs of the Evaluation Forum (CLEF)</source>
          ,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>J.</given-names>
            <surname>Wiebe</surname>
          </string-name>
          , E. Riloff,
          <article-title>Creating subjective and objective sentence classifiers from unannotated texts</article-title>
          , in: A.
          <string-name>
            <surname>Gelbukh</surname>
          </string-name>
          (Ed.),
          <source>Computational Linguistics and Intelligent Text Processing</source>
          , Springer Berlin Heidelberg,
          <year>2005</year>
          , pp.
          <fpage>486</fpage>
          -
          <lpage>497</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>L. L.</given-names>
            <surname>Vieira</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. L. M.</given-names>
            <surname>Jeronimo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. E. C.</given-names>
            <surname>Campelo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. B.</given-names>
            <surname>Marinho</surname>
          </string-name>
          ,
          <article-title>Analysis of the subjectivity level in fake news fragments</article-title>
          ,
          <source>in: Proceedings of the Brazilian Symposium on Multimedia and the Web</source>
          , WebMedia '20,
          Association
          for Computing Machinery, New York, NY, USA,
          <year>2020</year>
          , p.
          <fpage>233</fpage>
          -
          <lpage>240</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>R.</given-names>
            <surname>Mihalcea</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Banea</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Wiebe</surname>
          </string-name>
          ,
          <article-title>Learning multilingual subjective language via cross-lingual projections</article-title>
          , in: A.
          <string-name>
            <surname>Zaenen</surname>
          </string-name>
          , A. van den Bosch (Eds.),
          <source>Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics</source>
          , Association for Computational Linguistics, Prague, Czech Republic,
          <year>2007</year>
          , pp.
          <fpage>976</fpage>
          -
          <lpage>983</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>A.</given-names>
            <surname>Vaswani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Shazeer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Parmar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Uszkoreit</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Jones</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. N.</given-names>
            <surname>Gomez</surname>
          </string-name>
          , Ł. Kaiser,
          <string-name>
            <surname>I. Polosukhin</surname>
          </string-name>
          ,
          <article-title>Attention is all you need</article-title>
          ,
          <source>Advances in neural information processing systems</source>
          <volume>30</volume>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>J.</given-names>
            <surname>Devlin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.-W.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Toutanova</surname>
          </string-name>
          , BERT:
          <article-title>Pre-training of deep bidirectional transformers for language understanding</article-title>
          , in: J.
          <string-name>
            <surname>Burstein</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <string-name>
            <surname>Doran</surname>
          </string-name>
          , T. Solorio (Eds.),
          <source>Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</source>
          , Volume
          <volume>1</volume>
          (Long and Short Papers),
          <source>Association for Computational Linguistics</source>
          , Minneapolis, Minnesota,
          <year>2019</year>
          , pp.
          <fpage>4171</fpage>
          -
          <lpage>4186</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>A.</given-names>
            <surname>Radford</surname>
          </string-name>
          ,
          <string-name>
            <surname>J. Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Child</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Luan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Amodei</surname>
          </string-name>
          ,
          <string-name>
            <surname>I. Sutskever</surname>
          </string-name>
          ,
          <article-title>Language models are unsupervised multitask learners</article-title>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ott</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Goyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Du</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Joshi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Levy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lewis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Zettlemoyer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Stoyanov</surname>
          </string-name>
          ,
          <article-title>RoBERTa: A robustly optimized BERT pretraining approach</article-title>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>I. B.</given-names>
            <surname>Schlicht</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Khellaf</surname>
          </string-name>
          , D. Altiok, DWReCO at CheckThat! 2023:
          <article-title>Enhancing Subjectivity Detection through Style-based Data Sampling</article-title>
          ,
          <source>in: Proceedings of the Working Notes of CLEF 2023 - Conference and Labs of the Evaluation Forum (CLEF)</source>
          ,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>T.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Kornblith</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Norouzi</surname>
          </string-name>
          , G. Hinton,
          <article-title>SimCLR: A simple framework for contrastive learning of visual representations</article-title>
          ,
          <source>in: Proceedings of the 37th International Conference on Machine Learning (ICML)</source>
          , PMLR,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>N.</given-names>
            <surname>Reimers</surname>
          </string-name>
          ,
          <string-name>
            <surname>I. Gurevych</surname>
          </string-name>
          ,
          <article-title>Sentence-BERT: Sentence embeddings using Siamese BERT-networks</article-title>
          ,
          <source>in: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics</source>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>M.</given-names>
            <surname>Casanova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Chanson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Icard</surname>
          </string-name>
          , G. Faye, G. Gadek, G. Gravier,
          <string-name>
            <given-names>P.</given-names>
            <surname>Égré</surname>
          </string-name>
          ,
          <article-title>Notebook for the HYBRINFOX Team at CheckThat! 2024 - Task 2</article-title>
          ,
          <source>in: Proceedings of the Working Notes of CLEF 2024 - Conference and Labs of the Evaluation Forum (CLEF)</source>
          ,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>B.</given-names>
            <surname>Warner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Chafin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Clavié</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Weller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Hallström</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Taghadouini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gallagher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Biswas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Ladhak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Aarsen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Cooper</surname>
          </string-name>
          , G. Adams,
          <string-name>
            <given-names>J.</given-names>
            <surname>Howard</surname>
          </string-name>
          ,
          <string-name>
            <surname>I. Poli</surname>
          </string-name>
          ,
          <article-title>Smarter, better, faster, longer: A modern bidirectional encoder for fast, memory efficient, and long context finetuning and inference</article-title>
          ,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>M.</given-names>
            <surname>Ange</surname>
          </string-name>
          ,
          <article-title>Sentiment-Analysis-BERT</article-title>
          , https://huggingface.co/MarieAngeA13/Sentiment-Analysis-BERT,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>J.</given-names>
            <surname>Hartmann</surname>
          </string-name>
          ,
          <article-title>Emotion-english-distilroberta-base</article-title>
          , https://huggingface.co/j-hartmann/emotion-english-distilroberta-base/,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <surname>OpenAI</surname>
          </string-name>
          ,
          <source>GPT-4 Technical Report</source>
          ,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>J.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Sriram</surname>
          </string-name>
          , E. Choi, G. Durrett,
          <article-title>Generating literal and implied subquestions to fact-check complex claims</article-title>
          ,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>