<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Enhancing Fact-Checking: From Crowdsourced Validation to Integration with Large Language Models</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Kevin Roitero</string-name>
          <email>kevin.roitero@uniud.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Michael Soprano</string-name>
          <email>michael.soprano@uniud.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>David La Barbera</string-name>
          <email>david.labarbera@uniud.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Eddy Maddalena</string-name>
          <email>eddy.maddalena@uniud.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Stefano Mizzaro</string-name>
          <email>stefano.mizzaro@uniud.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Udine</institution>
          ,
          <addr-line>Udine</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>This extended abstract presents results from two recent studies [1, 2] aimed at enhancing the practical application and effectiveness of fact-checking systems. La Barbera et al. [1] detail the implementation of crowdsourcing in fact-checking, demonstrating its practical viability through experimental evaluation using a dataset of political public statements. Zeng et al. [2] build on this foundation by integrating crowdsourced data with Large Language Models, proposing the first hybrid system that combines human insights and AI capabilities.</p>
      </abstract>
      <kwd-group>
        <kwd>Misinformation</kwd>
        <kwd>Fact Checking</kwd>
        <kwd>Large Language Models</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        The rapid proliferation of misinformation across digital platforms poses significant challenges
to societal trust and public safety. Traditional fact-checking methods, predominantly based
on experts, are unable to cope with the ever-increasing volume and speed of misinformation
dissemination. This has created interest in the development of more scalable solutions that are
able to enhance the accuracy and efficiency of misinformation detection methods, addressing
the problem at scale. Crowdsourcing has emerged as a powerful tool in this domain [
        <xref ref-type="bibr" rid="ref3 ref4 ref5 ref6 ref7">3, 4, 5, 6, 7</xref>
        ],
leveraging the collective wisdom of the crowd [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] for fact verification. While
promising, the application of crowdsourcing to fact-checking requires careful consideration of
factors such as task design, worker motivation, and data quality to ensure its effectiveness.
      </p>
      <p>
        In this work, we present results from two recent studies in the field of crowdsourced
fact-checking [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ]. Specifically, La Barbera et al. [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] explore the potential of crowdsourcing to
provide a robust foundation for practical fact-checking applications at scale. Moreover, the
effectiveness of crowdsourcing can be significantly enhanced by leveraging Large Language
Models (LLMs), which provide powerful complementary support to human efforts. By combining
LLMs with human-generated annotations, Zeng et al. [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] introduce a novel hybrid approach
developed to address misinformation at scale.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Crowdsourced Fact-checking: Does It Actually Work?</title>
      <sec id="sec-2-2">
        <p>
          Objective and Methodology. The primary objective of La Barbera et al. [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] was to develop
a viable crowdsourcing approach to fact-checking. The methodology involved a
re-design of the crowdsourcing task used in previous studies. A diverse group of workers
was recruited through the Prolific platform. Participants were given tasks involving a curated
dataset of political statements from the PolitiFact website and were instructed to fact-check
each statement using the same six-level scale used by experts.
        </p>
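        <p>
          A minimal sketch of how such six-level judgments can be encoded for aggregation (the 0-5 ordinal coding and the names below are illustrative assumptions; the papers only state that workers used the same six-level scale as the experts):
        </p>
```python
# Hypothetical ordinal coding of PolitiFact's six truthfulness levels,
# from least to most truthful. The numeric values are an assumption
# made for illustration, not taken from the papers.
POLITIFACT_SCALE = [
    "pants-on-fire",
    "false",
    "mostly-false",
    "half-true",
    "mostly-true",
    "true",
]

def encode_judgment(label: str) -> int:
    """Return the ordinal position (0-5) of a categorical judgment."""
    return POLITIFACT_SCALE.index(label.lower())
```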
        <p>
          Results. This study highlighted the effectiveness of crowdsourcing for fact-checking,
demonstrating higher effectiveness when compared to previous studies [
          <xref ref-type="bibr" rid="ref6 ref9">6, 9</xref>
          ]. La Barbera et al. [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ]
compared workers’ agreement levels across different studies. Figure 1 presents the results, with
the three series replicating the findings from previous research by Soprano et al., Draws et al.,
and their study. For each series in the figure, the x-axis displays each statement’s ground truth
level, while the y-axis represents the mean of the assessments supplied by the crowd. Small
dots represent individual statements, with larger markers denoting the median values for
statements at each truthfulness level. La Barbera et al. observed a distinct trend where the
median aggregated truthfulness values consistently increased with higher ground truth levels,
indicating a clear relationship between the truthfulness of statements as perceived by
the crowd and the actual one provided by experts. This pattern was less evident in the studies by
Soprano et al. and Draws et al., where median values decreased in some cases, such as moving
from Pants-On-Fire to False truthfulness levels. The results were further validated
through statistically significant differences among them: our study differed significantly from
both Soprano et al. (p &lt; 0.01) and Draws et al. (p &lt; 0.05), whereas no significant differences
were observed between the latter two studies.
        </p>
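        <p>
          The aggregation behind Figure 1 can be sketched as follows, using synthetic worker judgments on a 0 (Pants-on-Fire) to 5 (True) ordinal scale; the data and the monotonicity check are illustrative, mirroring the reported trend rather than reproducing it:
        </p>
```python
from statistics import mean, median

# Synthetic worker assessments per statement, grouped by the expert
# ground-truth level (0 = Pants-on-Fire ... 5 = True). Illustrative only.
assessments = {
    0: [[0, 1, 0], [1, 0, 2]],
    1: [[1, 2, 1], [0, 1, 1]],
    2: [[2, 2, 3], [1, 2, 2]],
    3: [[3, 2, 3], [4, 3, 3]],
    4: [[4, 4, 3], [5, 4, 4]],
    5: [[5, 4, 5], [4, 5, 5]],
}

# Aggregate each statement by the mean of its judgments, then take the
# median of the aggregated values at each ground-truth level.
medians = {
    level: median(mean(stmt) for stmt in stmts)
    for level, stmts in assessments.items()
}

# The trend reported for Figure 1: medians increase with ground truth.
levels = sorted(medians)
monotonic = all(medians[a] <= medians[b] for a, b in zip(levels, levels[1:]))
```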
      </sec>
    </sec>
    <sec id="sec-3">
      <p>
        Discussion. The results of La Barbera et al. [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] mark a significant advancement in the
application of crowdsourcing for fact-checking, achieved through an improved task design. The clear
trend of truthfulness assessments increasing with higher ground
truth levels, particularly in our study compared to previous ones, underscores the effectiveness
of the crowd. Such improvements have not only boosted the accuracy of assessments but
also demonstrated the reliability of crowdsourcing as a practical and effective tool against
misinformation.
      </p>
      <p>
        The notable difference in La Barbera et al. [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]’s study compared to prior research, as
statistically validated, suggests that task framing in crowdsourcing approaches can affect the
outcomes of fact-checking tasks. Moreover, specific design choices can significantly enhance
performance. It also highlighted that, while current results are promising, the complexity of
fact-checking as a task remains high. In conclusion, the presented study not only contributes to the
empirical understanding and validation of crowdsourcing in fact-checking but also provides a
foundational framework for future improvements in this area.
      </p>
      <p>3. Combining Large Language Models and Crowdsourcing for Hybrid Human-AI Misinformation Detection</p>
      <p>
        Objective and Methodology. The primary objective of Zeng et al. [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] was to enhance the
effectiveness and efficiency of fact-checking by developing a hybrid system that integrates
crowdsourced data with outputs from Large Language Models (LLMs). The methodology
involved processing the dataset of statements evaluated by crowd workers in a previous
study, with the addition of analysis from fine-tuned LLMs such as BERT [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], RoBERTa [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ],
and DeBERTa [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ].
      </p>
      <p>To integrate human and AI insights, several combination strategies were explored, including
simple averaging, weighted averaging based on confidence scores, and more complex ensemble
methods tailored to the specific characteristics of the data. The performance of the hybrid
system was evaluated using standard metrics like accuracy, precision, recall, and F1-score,
supplemented by detailed error analysis to fine-tune the integration approach and ensure
high-quality fact-checking.</p>
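      <p>
        The first two strategies can be illustrated as follows, assuming the crowd and model judgments live on the same numeric truthfulness scale and that the model exposes a confidence in [0, 1]; the function names and signatures are hypothetical sketches, not the paper's implementation:
      </p>
```python
def simple_average(crowd_score: float, model_score: float) -> float:
    """Unweighted mean of the human and the model judgment."""
    return (crowd_score + model_score) / 2

def confidence_weighted(crowd_score: float, model_score: float,
                        model_confidence: float) -> float:
    """Weight the model by its confidence, the crowd by the remainder.

    With confidence 0 the hybrid falls back to the crowd judgment;
    with confidence 1 it trusts the model alone.
    """
    return (1 - model_confidence) * crowd_score + model_confidence * model_score
```
      <p>
        For example, a statement rated 4 (Mostly-True) by the crowd and 2 (Mostly-False) by a model with confidence 0.5 would receive a hybrid score of 3 (Half-True) under either strategy.
      </p>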
      <p>Results. The integration of crowdsourced data with LLMs provided significant insights
into the potential of hybrid systems for misinformation detection. The performance was
quantitatively evaluated using two scales, S2 and S6, reflecting simpler and more complex
judgment scales, respectively.</p>
      <p>Figure 2 provides a detailed comparison of accuracy and Mean Squared Error (MSE) across
individual models, crowdsourced data, and their integration. For the S2 scale, the results
demonstrate consistent model performance, with an accuracy of around 0.7. In contrast, crowdsourced
judgments maintained a higher accuracy of 0.816, surpassing individual model performances
across all aggregation methods except soft-voting. For the S6 scale, the models achieved their best
accuracy using hard-voting, soft-voting, and median aggregation methods (up to 0.441). The
most effective hybrid combination on the S2 scale was achieved using the Meta Vote method,
which delivered an outstanding accuracy of 0.875 and the lowest error rates (MSE 0.125). For
the S6 scale, the median aggregation provided the best accuracy (0.441), indicating that simpler
aggregation methods might be more effective for tasks with a more fine-grained scale.</p>
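      <p>
        For reference, the two reported metrics are computed in the standard way over discrete truthfulness labels; the labels below are made up for illustration and do not reproduce the paper's figures:
      </p>
```python
def accuracy(truth, pred):
    """Fraction of statements whose predicted label matches exactly."""
    return sum(t == p for t, p in zip(truth, pred)) / len(truth)

def mse(truth, pred):
    """Mean squared error; penalizes distance on the ordinal scale."""
    return sum((t - p) ** 2 for t, p in zip(truth, pred)) / len(truth)

# Made-up labels on the six-level (S6) scale, coded 0-5.
truth = [0, 1, 2, 3, 4, 5, 5, 3]
pred  = [0, 1, 2, 2, 4, 5, 4, 3]

acc = accuracy(truth, pred)  # 0.75: six of eight labels match exactly
err = mse(truth, pred)       # 0.25: two off-by-one errors over eight items
```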
      <p>
        Further analysis involved evaluating classification differences across various truthfulness
levels using confusion matrices (not shown), which revealed that models were more consistent in
classifying middle-scale values like Mostly-False, Half-True, and Mostly-True but struggled with
extreme categories such as Pants-On-Fire and True. Conversely, crowdsourced data showed
a better capability to identify these extremes, indicating a deeper contextual understanding
of the statements. The hybrid combinations provided more balanced judgments across the
truthfulness spectrum of the S6 scale. While these combinations did not always outperform
other methods in terms of raw accuracy or error rates, they offered a more subtle and robust
classification, crucial for tasks requiring a sophisticated understanding of truthfulness.
      </p>
      <p>
        Discussion. The integration of crowdsourced data with LLMs proposed by Zeng et al. [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] in
their hybrid system for misinformation detection yields insightful results that underscore the
complexity and potential of such approaches. This study not only demonstrates the viability
of hybrid models in enhancing fact-checking accuracy but also reveals the complex interplay
between different aggregation methods and the nature of the task.
      </p>
      <p>
        Zeng et al. [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] suggest that hybrid models, which combine human judgment and machine
intelligence, can significantly improve the reliability and accuracy of misinformation detection
across various scales. The superior performance of the Meta Vote method in simpler judgment
tasks (S2 scale) and the median method in more complex scenarios (S6 scale) highlights the
importance of selecting appropriate aggregation strategies based on the task’s specific
requirements. This adaptability is crucial in real-world applications where the type and complexity of
misinformation can vary greatly.
      </p>
      <p>Acknowledgments. This research is partially supported by the European Union’s NextGenerationEU PNRR
M4.C2.1.1 – PRIN 2022 project “20227F2ZN3 MoT–The Measure of Truth: An Evaluation-Centered Machine-Human
Hybrid Framework for Assessing Information Truthfulness” - 20227F2ZN3_001 – CUP G53D23002800006, and by
the Strategic Plan of the University of Udine–Interdepartmental Project on Artificial Intelligence (2020-25).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>D.</given-names>
            <surname>La Barbera</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Maddalena</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Soprano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Roitero</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Demartini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ceolin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Spina</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Mizzaro</surname>
          </string-name>
          ,
          <article-title>Crowdsourced Fact-checking: Does It Actually Work?</article-title>
          ,
          <source>Information Processing &amp; Management</source>
          <volume>61</volume>
          (
          <year>2024</year>
          )
          103792. doi:10.1016/j.ipm.2024.103792.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>X.</given-names>
            <surname>Zeng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>La Barbera</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Roitero</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Zubiaga</surname>
          </string-name>
          , S. Mizzaro,
          <article-title>Combining Large Language Models and Crowdsourcing for Hybrid Human-AI Misinformation Detection</article-title>
          ,
          <source>in: Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '24</source>
          , ACM, New York, NY, USA,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>J.</given-names>
            <surname>Allen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. A.</given-names>
            <surname>Arechar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Pennycook</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. G.</given-names>
            <surname>Rand</surname>
          </string-name>
          ,
          <article-title>Scaling Up Fact-Checking Using the Wisdom of Crowds</article-title>
          ,
          <source>Science Advances</source>
          <volume>7</volume>
          (
          <year>2021</year>
          )
          <article-title>eabf4393</article-title>
          . doi:10.1126/sciadv.abf4393.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>J.</given-names>
            <surname>Allen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Martel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. G.</given-names>
            <surname>Rand</surname>
          </string-name>
          ,
          <article-title>Birds of a Feather Don't Fact-Check Each Other: Partisanship and the Evaluation of News in Twitter's Birdwatch Crowdsourced Fact-Checking Program</article-title>
          ,
          <source>in: Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems, CHI '22</source>
          , ACM, New York, NY, USA,
          <year>2022</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>19</lpage>
          . doi:10.1145/3491102.3502040.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>K.</given-names>
            <surname>Roitero</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Soprano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Fan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Spina</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Mizzaro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Demartini</surname>
          </string-name>
          ,
          <article-title>Can The Crowd Identify Misinformation Objectively? The Effects of Judgment Scale and Assessor's Background</article-title>
          ,
          <source>in: Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '20</source>
          ,
          ACM, New York, NY, USA,
          <year>2020</year>
          , pp.
          <fpage>439</fpage>
          -
          <lpage>448</lpage>
          . doi:10.1145/3397271.3401112.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>M.</given-names>
            <surname>Soprano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Roitero</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>La Barbera</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ceolin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Spina</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Mizzaro</surname>
          </string-name>
          , G. Demartini,
          <article-title>The Many Dimensions of Truthfulness: Crowdsourcing Misinformation Assessments on a Multidimensional Scale</article-title>
          ,
          <source>Information Processing &amp; Management</source>
          <volume>58</volume>
          (
          <year>2021</year>
          )
          102710. doi:10.1016/j.ipm.2021.102710.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>D.</given-names>
            <surname>La Barbera</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Soprano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Roitero</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Maddalena</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Mizzaro</surname>
          </string-name>
          ,
          <article-title>Fact-Checking at Scale with Crowdsourcing: Experiments and Lessons Learned</article-title>
          ,
          <source>in: Proceedings of the 13th Italian Information Retrieval Workshop</source>
          , volume
          <volume>3448</volume>
          , CEUR-WS.org,
          <year>2023</year>
          , pp.
          <fpage>85</fpage>
          -
          <lpage>90</lpage>
          . URL: https://ceur-ws.org/Vol-3448/paper-18.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>J.</given-names>
            <surname>Howe</surname>
          </string-name>
          ,
          <article-title>The Rise of Crowdsourcing</article-title>
          ,
          <source>Wired Magazine</source>
          <volume>14</volume>
          (
          <year>2006</year>
          )
          <fpage>1</fpage>
          -
          <lpage>4</lpage>
          . URL: https://www.wired.com/2006/06/crowds/.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>T.</given-names>
            <surname>Draws</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>La Barbera</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Soprano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Roitero</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ceolin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Checco</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Mizzaro</surname>
          </string-name>
          ,
          <article-title>The Effects of Crowd Worker Biases in Fact-Checking Tasks</article-title>
          , in: 2022 ACM Conference on Fairness, Accountability, and Transparency,
          <source>FAccT '22</source>
          ,
          ACM, Seoul, Republic of Korea,
          <year>2022</year>
          , pp.
          <fpage>2114</fpage>
          -
          <lpage>2124</lpage>
          . doi:10.1145/3531146.3534629.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>J.</given-names>
            <surname>Devlin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.-W.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Toutanova</surname>
          </string-name>
          ,
          <article-title>BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding</article-title>
          ,
          <source>in: Proceedings of the 2019 Conference of the North American Chapter of the ACL: Human Language Technologies</source>
          , Volume
          <volume>1</volume>
          (Long and Short Papers), ACL, Minneapolis, Minnesota,
          <year>2019</year>
          , pp.
          <fpage>4171</fpage>
          -
          <lpage>4186</lpage>
          . doi:10.18653/v1/N19-1423.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ott</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Goyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Du</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Joshi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Levy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lewis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Zettlemoyer</surname>
          </string-name>
          , V. Stoyanov,
          <article-title>RoBERTa: A Robustly Optimized BERT Pretraining Approach</article-title>
          ,
          <year>2019</year>
          . doi:10.48550/arXiv.1907.11692. arXiv:1907.11692.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>P.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gao</surname>
          </string-name>
          , W. Chen,
          <article-title>DeBERTa: Decoding-enhanced BERT with Disentangled Attention</article-title>
          ,
          <year>2021</year>
          . doi:10.48550/arXiv.2006.03654. arXiv:2006.03654.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>