<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Conference and Labs of the Evaluation Forum, September</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Integrating Dual BERT Models and Causal Language Models for Enhanced Detection of Machine-Generated Texts</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Jitong Chen</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Leilei Kong</string-name>
        </contrib>
        <aff>Foshan University, Foshan, China</aff>
      </contrib-group>
      <pub-date>
        <year>2024</year>
      </pub-date>
      <volume>0</volume>
      <fpage>9</fpage>
      <lpage>12</lpage>
      <abstract>
        <p>Detecting machine-generated text requires accurately distinguishing text produced by artificial intelligence from text authored by humans. In this evaluation, we use a method that integrates two BERT models with a causal language model trained specifically on in-distribution (ID) samples. The integration improves on the performance of the individual models. Experimental results indicate that our method is reasonably effective at distinguishing between machine-generated and human-authored texts.</p>
      </abstract>
      <kwd-group>
        <kwd>Detecting Machine-generated Texts</kwd>
        <kwd>BERT Model</kwd>
        <kwd>Causal Language Model</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        In recent years, the rapid development of large language models such as ChatGPT and Claude has
increased awareness and use of artificial intelligence technologies among a wider audience. While
these advances have improved efficiency in daily tasks, they have also had negative
consequences, such as the unethical use of AI to generate academic papers or complete assignments [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
Detecting machine-generated text has therefore become crucial.
      </p>
      <p>
        The goal of machine-generated text detection is to discern whether a text was produced by a machine
or by a human [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. As defined by the Voight-Kampff Generative AI Authorship Verification task 2024, the input
is a pair of texts, and the system must determine which one is human-authored and which one is
machine-generated. Each pair is assigned a score: a score below 0.5 indicates that the first text is
human-authored, while a score above 0.5 indicates that the second text is human-authored [
        <xref ref-type="bibr" rid="ref3 ref4 ref5">3, 4, 5</xref>
        ]. In
major AI competitions such as Kaggle, participants have showcased inspiring methods for
machine-generated text detection. These methods mainly involve fine-tuning models in a supervised manner,
enriching datasets to keep the data distribution balanced and ensure effective feature learning, and
combining multiple models to distinguish between machine-generated and human-authored texts [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
      </p>
      <p>BERT models perform well at capturing contextual information, while causal language models are
better at capturing the causal relationships in text generation. Building on existing work,
we hypothesize that integrating BERT models with a causal language model
lets each offset the other's weaknesses and reduces the biases and errors that a single model might produce.
We therefore use a method that integrates two BERT models with a causal language model trained specifically
on in-distribution (ID) samples.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Method</title>
      <p>
        Our method, depicted in Figure 1, trains two distinct BERT models, BERT
Model A and BERT Model B, on the same dataset but with different processing techniques.
We then train a causal language model exclusively on machine-generated texts and use it to
compute the perplexity of a text segment. Perplexity is a metric for evaluating a
language model: it measures how well the model predicts the next word<sup>1</sup>, with lower values indicating better prediction. Because of slight differences
in causal logic between machine-generated and human-authored texts at the sentence or word level,
we classify a segment as machine-generated if the perplexity computed by the causal language
model is low; otherwise, it is classified as human-authored [
        <xref ref-type="bibr" rid="ref7 ref8">7, 8</xref>
        ]. We then integrate these three models
to identify machine-generated text [9].
      </p>
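      <p>As a minimal illustration of the metric, perplexity can be written as the exponential of the mean negative log-likelihood per token. The sketch below assumes the per-token natural-log probabilities are already available; the function name and the sample values are hypothetical and are not taken from our implementation.</p>
      <preformat>
```python
import math
from typing import Sequence


def perplexity(token_log_probs: Sequence[float]) -> float:
    """Perplexity = exp(mean negative log-likelihood per token)."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


# Hypothetical natural-log probabilities a causal LM assigned to each token.
confident = [math.log(0.5)] * 4  # the model finds this text predictable
surprised = [math.log(0.1)] * 4  # the model finds this text surprising

print(perplexity(confident))  # close to 2
print(perplexity(surprised))  # close to 10
```
      </preformat>
      <p>A text that the model finds predictable receives the lower perplexity, which is the signal the causal language model contributes to the ensemble.</p>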
      <p>The detailed workflow is as follows. The method takes two text segments (text1
and text2) as input. The segments are joined with a [SEP] token and fed to both BERT
Model A and BERT Model B, yielding label A and label B. Meanwhile, text1 and text2
are input independently into the causal language model, which computes a perplexity score
for each (Perplexity1 and Perplexity2). A function then derives the causal language model's
output, label C, from the two scores:
if Perplexity1 is less than Perplexity2, label C is set to 1; otherwise, it is set to 0.</p>
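      <p>The rule for label C can be sketched as a small function. The function name is hypothetical; the comparison itself follows the rule stated above, and the comments on label meaning follow the labeling convention described in Section 3.3.</p>
      <preformat>
```python
def label_from_perplexities(perplexity1: float, perplexity2: float) -> int:
    """Return label C: 1 if text1 has the lower perplexity, otherwise 0."""
    # The causal LM is trained on machine-generated text only, so the lower
    # perplexity marks the text that looks more machine-like.
    # Label 1: text2 is taken to be human-authored.
    # Label 0: text1 is taken to be human-authored (this also covers ties).
    return 1 if perplexity2 > perplexity1 else 0


print(label_from_perplexities(12.0, 30.0))
print(label_from_perplexities(30.0, 12.0))
```
      </preformat>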
      <p>Each model generates a label based on its analysis results, indicating which segment of text it believes
is more likely to be human-authored. Finally, through a voting mechanism, the evaluations from all
models are aggregated to determine the final output, identifying which segment of text is authored by a
human[10].</p>
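      <p>A minimal sketch of the voting step is given below. It assumes that each of the three labels has already been reduced to 0 or 1, so that a strict majority always exists; the function name and this binary assumption are ours for illustration.</p>
      <preformat>
```python
def majority_vote(label_a: int, label_b: int, label_c: int) -> int:
    """Return the label chosen by at least two of the three models."""
    votes = (label_a, label_b, label_c)
    # Assumption: every vote is binary, so three votes cannot tie.
    if any(vote not in (0, 1) for vote in votes):
        raise ValueError("each label must be 0 or 1")
    return 1 if sum(votes) >= 2 else 0


print(majority_vote(0, 1, 1))
print(majority_vote(0, 0, 1))
```
      </preformat>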
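      <p>[Figure 1: text1 [SEP] text2 is fed to BERT Model A (label A) and BERT Model B (label B); text1 and text2 are fed to the Causal Language Model, giving Perplexity 1 and Perplexity 2, which a Function maps to label C; labels A, B, and C enter VOTE, which outputs label D.]</p>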
      <p><sup>1</sup> https://huggingface.co/learn/nlp-course/chapter7/3</p>
    </sec>
    <sec id="sec-3">
      <title>3. Experiments</title>
      <sec id="sec-3-1">
        <title>3.1. Datasets</title>
        <p>The datasets are the PAN24 generation-author-news dataset and Kaggle’s
DAIGT-v2-train-dataset<sup>2</sup>. The PAN24 generation-author-news dataset is provided by the task organizers, while
DAIGT-v2-train-dataset was contributed by a participant in Kaggle’s machine-generated text detection
competition. The PAN24 dataset contains texts generated by 13 large language models alongside
human-authored texts; the 13 machine-generated sets and the single human-authored set each
cover the same 1,087 topics. DAIGT-v2-train-dataset contains machine-generated
and human-authored texts on 15 shared topics.</p>
        <p>Because large models limit the number of input tokens, we counted the tokens in these
texts.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Causal Language Model Data Processing</title>
        <p>Our analysis of the dataset revealed that some texts exceed 512 tokens. To address this, we
segment each text by length and average the perplexity
scores of its segments to obtain the final result. This not only overcomes the token length
limitation but may also improve model accuracy. Under our segmentation rules,
each segment is at most 500 characters long and segment boundaries fall on punctuation marks.</p>
        <p>In our experiments, very short texts adversely affected model accuracy, so we
discarded segments shorter than 50 characters after segmentation. We randomly selected 1,087
machine-generated texts on unique topics from the 13 machine-generated datasets and combined them with the
human-authored dataset to form a new dataset, which was then segmented according to the
segmentation rules to create a validation set. The remaining machine-generated texts were
segmented according to the same rules and used as the training set.</p>
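        <p>The segmentation and averaging described in this section can be sketched as follows. This is an illustration under stated assumptions, not our exact code: we assume that "punctuation marks" means the characters . ! ? ; and that a single piece longer than the limit is cut at the limit. The names split_into_segments and mean_segment_perplexity are hypothetical, and score stands in for any function that returns the perplexity of one segment.</p>
        <preformat>
```python
import re
from typing import Callable, List

MAX_CHARS = 500  # from the text: a segment does not exceed 500 characters
MIN_CHARS = 50   # from the text: shorter segments are discarded


def split_into_segments(text: str, max_chars: int = MAX_CHARS,
                        min_chars: int = MIN_CHARS) -> List[str]:
    """Greedily pack punctuation-delimited pieces into segments."""
    # Assumption: pieces end at one of . ! ? ; (the source does not list them).
    pieces = [m.strip() for m in re.findall(r"[^.!?;]+[.!?;]*", text)]
    segments: List[str] = []
    current = ""
    for piece in pieces:
        # Assumption: a piece longer than max_chars is hard-cut at the limit.
        while len(piece) > max_chars:
            if current:
                segments.append(current)
                current = ""
            segments.append(piece[:max_chars])
            piece = piece[max_chars:].strip()
        candidate = f"{current} {piece}".strip()
        if len(candidate) > max_chars:
            segments.append(current)
            current = piece
        else:
            current = candidate
    if current:
        segments.append(current)
    # Drop segments that are too short to score reliably.
    return [s for s in segments if len(s) >= min_chars]


def mean_segment_perplexity(text: str, score: Callable[[str], float]) -> float:
    """Average the per-segment scores returned by `score`."""
    segments = split_into_segments(text)
    if not segments:
        raise ValueError("no segment reaches the minimum length")
    return sum(score(s) for s in segments) / len(segments)


sentence = "a" * 99 + "."
demo_text = " ".join([sentence] * 12)
demo_segments = split_into_segments(demo_text)
print(len(demo_segments), [len(s) for s in demo_segments])
```
        </preformat>
        <p>In practice, score would wrap the trained causal language model; here any callable that maps a string to a number can be used to exercise the segmentation logic.</p>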
      </sec>
      <sec id="sec-3-3">
        <title>3.3. BERT Model Data Processing</title>
        <p>In the PAN24 generation-author-news dataset, we paired the human-authored dataset with each of the 13
machine-generated datasets by matching texts on the same topic. The human-authored text was
designated text1 and the machine-generated text text2, and the pair was assigned label 0. We applied
the same pairing method to DAIGT-v2-train-dataset, combining human-authored texts with
machine-generated texts on similar topics. After completing all pairings, we swapped the content
of text1 and text2 for half of the data and changed the label to 1, thus forming Dataset A. Considering the
maximum number of tokens the model can accept and the relationship between the two text segments,
we then swapped the content of text1 and text2 in Dataset A again and inverted the label values, creating
Dataset B. Both Dataset A and Dataset B were split into training and validation sets in a 7:3 ratio.
Texts were always truncated at the end after concatenation. If the first text alone exceeded the BERT
token limit, the pair was effectively treated as a single-text classification problem. The text swapping
described above mitigates the adverse effects of this issue to some extent, because every text appears
in the first position in one of the two datasets.</p>
        <p><sup>2</sup> https://www.kaggle.com/competitions/llm-detect-ai-generated-text/discussion/455517</p>
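        <p>The construction of Dataset A and Dataset B can be sketched as below. The sketch assumes that the half to be swapped is chosen at random and that rows are shuffled before the 7:3 split; neither detail is specified above, and all function names are hypothetical.</p>
        <preformat>
```python
import random
from typing import Iterable, List, Tuple

Pair = Tuple[str, str, int]  # (text1, text2, label)


def build_dataset_a(human_machine_pairs: Iterable[Tuple[str, str]],
                    seed: int = 0) -> List[Pair]:
    """Start from (human, machine, 0) and swap half of the rows to label 1."""
    rows = [(human, machine, 0) for human, machine in human_machine_pairs]
    rng = random.Random(seed)
    # Assumption: the swapped half is a random sample of row indices.
    swap_ids = set(rng.sample(range(len(rows)), len(rows) // 2))
    return [(t2, t1, 1) if i in swap_ids else (t1, t2, label)
            for i, (t1, t2, label) in enumerate(rows)]


def build_dataset_b(dataset_a: Iterable[Pair]) -> List[Pair]:
    """Mirror Dataset A: swap every pair and invert its label."""
    return [(t2, t1, 1 - label) for t1, t2, label in dataset_a]


def split_train_valid(rows: Iterable[Pair],
                      seed: int = 0) -> Tuple[List[Pair], List[Pair]]:
    """Shuffle (an assumption) and split 7:3 into training and validation."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 7 // 10
    return shuffled[:cut], shuffled[cut:]


pairs = [(f"human text {i}", f"machine text {i}") for i in range(10)]
dataset_a = build_dataset_a(pairs)
dataset_b = build_dataset_b(dataset_a)
train_a, valid_a = split_train_valid(dataset_a)
print(len(train_a), len(valid_a))
```
        </preformat>
        <p>In both datasets label 0 means that text1 is the human-authored text and label 1 means that text2 is; Dataset B simply presents every pair in the opposite order.</p>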
      </sec>
      <sec id="sec-3-4">
        <title>3.4. Causal Language Model Experimental Settings</title>
        <p>We use GPT-2 as the causal language model for computing text perplexity.
For training, we chose the AdamW optimizer, which combines an adaptive learning rate
with decoupled weight decay regularization. When configuring the optimizer, we distinguished between
parameters that require decay and those that do not: the ’bias’ and ’LayerNorm.weight’ parameters
were exempt from weight decay, whereas all other parameters were subject to it. The initial
learning rate was 5e-5, with a ’constant_with_warmup’ learning rate schedule, which
starts with a warmup phase and then holds the learning rate constant. We
calculated the number of update steps per training epoch from the batch size and the gradient
accumulation steps, and then derived the number of training epochs from the maximum number of
training steps so that training runs to completion consistently. Each batch is first placed on the
device and then undergoes what we call bidirectional forward propagation: two separate
forward passes that yield two sets of outputs. To prevent gradient explosion, gradient
clipping was applied. All experiments were conducted on an NVIDIA A800 GPU with 80GB
memory, with a batch size of 8.</p>
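        <p>The optimizer grouping and the schedule arithmetic described above can be illustrated without any deep learning library. The sketch below is a simplification under assumptions: the warmup length of 100 steps, the example parameter names, and the function names are hypothetical, and the warmup is assumed to be a linear ramp. Only the exempt name patterns and the 5e-5 learning rate come from the text.</p>
        <preformat>
```python
import math
from typing import Dict, Iterable, List

# From the text: these parameters are exempt from weight decay.
NO_DECAY = ("bias", "LayerNorm.weight")


def split_decay_groups(param_names: Iterable[str]) -> Dict[str, List[str]]:
    """Sort parameter names into decayed and non-decayed groups."""
    groups: Dict[str, List[str]] = {"decay": [], "no_decay": []}
    for name in param_names:
        key = "no_decay" if any(tag in name for tag in NO_DECAY) else "decay"
        groups[key].append(name)
    return groups


def constant_with_warmup(step: int, base_lr: float = 5e-5,
                         warmup_steps: int = 100) -> float:
    """Linear ramp to base_lr over warmup_steps, then a constant rate."""
    # Assumption: warmup_steps=100 is illustrative; the text gives no value.
    if warmup_steps > step:
        return base_lr * step / warmup_steps
    return base_lr


def epochs_needed(max_train_steps: int, batches_per_epoch: int,
                  grad_accum_steps: int) -> int:
    """Derive the epoch count from the maximum number of update steps."""
    updates_per_epoch = math.ceil(batches_per_epoch / grad_accum_steps)
    return math.ceil(max_train_steps / updates_per_epoch)


# Hypothetical parameter names, for illustration only.
names = ["block.0.attn.weight", "block.0.attn.bias", "block.0.LayerNorm.weight"]
groups = split_decay_groups(names)
print(groups["decay"], groups["no_decay"])
print(constant_with_warmup(50), constant_with_warmup(500))
print(epochs_needed(max_train_steps=1000, batches_per_epoch=800, grad_accum_steps=4))
```
        </preformat>
        <p>In a real training script the two name groups would be passed to the optimizer as separate parameter groups, one with the chosen weight decay and one with a weight decay of zero.</p>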
      </sec>
      <sec id="sec-3-5">
        <title>3.5. BERT Model Experimental Settings</title>
        <p>We trained the BERT model on Dataset A and Dataset B separately, resulting in Model A and Model
B, respectively. The BERT models were optimized with the AdamW optimizer at a learning
rate of 3e-5. The batch size was 8, and the models were trained for 3 epochs. Notably, our
experiments revealed that although this appears to be a binary classification task, setting the number of
output categories to three rather than two significantly improves performance.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Results</title>
      <p>In this evaluation task, our method surpasses the baseline in both the minimum and median scores, and
exceeds most baselines in the 25th quantile, 75th quantile, and maximum scores, as shown in Table 2.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>This paper describes the method we used in the Generative AI Authorship Verification task of the PAN@CLEF
2024 competition, which integrates two trained BERT models with a causal language model. The results
indicate that our method is effective in distinguishing between machine-generated and human-authored
texts, leveraging the strengths of both types of models. However, our study has limitations, and
the performance of the individual models can still be improved, for example by
exploring alternative base models or adopting different training strategies.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>This work is supported by the National Social Science Foundation of China (No. 22BTQ101).</p>
      <p>[9] F. Harrag, M. Debbah, K. Darwish, A. Abdelali, BERT transformer model for detecting Arabic GPT2
auto-generated tweets, arXiv preprint arXiv:2101.09345 (2021).</p>
      <p>[10] G. Dhaou, G. Lejeune, Comparison between voting classifier and deep learning methods for Arabic
dialect identification, in: Proceedings of the Fifth Arabic Natural Language Processing Workshop,
2020, pp. 243–249.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>C.</given-names>
            <surname>Vasilatos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Alam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Rahwan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zaki</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Maniatakos</surname>
          </string-name>
          ,
          <article-title>HowkGPT: Investigating the detection of ChatGPT-generated university student homework through context-aware perplexity analysis</article-title>
          ,
          <source>arXiv preprint arXiv:2305.18226</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>M.</given-names>
            <surname>Chakraborty</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Tonmoy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Zaman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Sharma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. R.</given-names>
            <surname>Barman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Gautam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Kumar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Jain</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Chadha</surname>
          </string-name>
          , et al.,
          <article-title>Counter Turing Test CT^2: AI-generated text detection is not as easy as you may think - introducing AI detectability index</article-title>
          ,
          <source>arXiv preprint arXiv:2310.05030</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>A. A.</given-names>
            <surname>Ayele</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Babakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Bevendorf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X. B.</given-names>
            <surname>Casals</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Chulvi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Dementieva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Elnagar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Freitag</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Fröbe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Korenčić</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mayerl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Moskovskiy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mukherjee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Panchenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Potthast</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Rangel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Rizwan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rosso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Schneider</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Smirnova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Stamatatos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Stakovskii</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Stein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Taulé</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ustalov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wiegmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Yimam</surname>
          </string-name>
          , E. Zangerle,
          <article-title>Overview of PAN 2024: Multi-Author Writing Style Analysis, Multilingual Text Detoxification, Oppositional Thinking Analysis, and Generative AI Authorship Verification</article-title>
          , in:
          <string-name><given-names>L.</given-names> <surname>Goeuriot</surname></string-name>,
          <string-name><given-names>P.</given-names> <surname>Mulhem</surname></string-name>,
          <string-name><given-names>G.</given-names> <surname>Quénot</surname></string-name>,
          <string-name><given-names>D.</given-names> <surname>Schwab</surname></string-name>,
          <string-name><given-names>L.</given-names> <surname>Soulier</surname></string-name>,
          <string-name><given-names>G. M. D.</given-names> <surname>Nunzio</surname></string-name>,
          <string-name><given-names>P.</given-names> <surname>Galuščáková</surname></string-name>,
          <string-name><given-names>A. G. S.</given-names> <surname>de Herrera</surname></string-name>,
          G. Faggioli, N. Ferro (Eds.),
          <source>Experimental IR Meets Multilinguality, Multimodality, and Interaction. Proceedings of the Fifteenth International Conference of the CLEF Association (CLEF</source>
          <year>2024</year>
          ), Lecture Notes in Computer Science, Springer, Berlin Heidelberg New York,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>J.</given-names>
            <surname>Bevendorf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wiegmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Karlgren</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Dürlich</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Gogoulou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Talman</surname>
          </string-name>
          , E. Stamatatos,
          <string-name>
            <given-names>M.</given-names>
            <surname>Potthast</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Stein</surname>
          </string-name>
          ,
          <article-title>Overview of the “Voight-Kampff” Generative AI Authorship Verification Task at PAN and ELOQUENT 2024</article-title>
          , in: G. Faggioli,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ferro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Galuščáková</surname>
          </string-name>
          , A. G. S. de Herrera (Eds.),
          <source>Working Notes of CLEF 2024 - Conference and Labs of the Evaluation Forum, CEUR Workshop Proceedings, CEUR-WS.org</source>
          ,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>M.</given-names>
            <surname>Fröbe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wiegmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Kolyada</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Grahm</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Elstner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Loebe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hagen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Stein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Potthast</surname>
          </string-name>
          ,
          <article-title>Continuous Integration for Reproducible Shared Tasks with TIRA.io</article-title>
          , in:
          <string-name><given-names>J.</given-names> <surname>Kamps</surname></string-name>,
          <string-name><given-names>L.</given-names> <surname>Goeuriot</surname></string-name>,
          <string-name><given-names>F.</given-names> <surname>Crestani</surname></string-name>,
          <string-name><given-names>M.</given-names> <surname>Maistro</surname></string-name>,
          <string-name><given-names>H.</given-names> <surname>Joho</surname></string-name>,
          <string-name><given-names>B.</given-names> <surname>Davis</surname></string-name>,
          <string-name><given-names>C.</given-names> <surname>Gurrin</surname></string-name>,
          <string-name><given-names>U.</given-names> <surname>Kruschwitz</surname></string-name>,
          A. Caputo (Eds.),
          <source>Advances in Information Retrieval. 45th European Conference on IR Research (ECIR</source>
          <year>2023</year>
          ), Lecture Notes in Computer Science, Springer, Berlin Heidelberg New York,
          <year>2023</year>
          , pp.
          <fpage>236</fpage>
          -
          <lpage>241</lpage>
          . doi:10.1007/978-3-031-28241-6_20.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Fang</surname>
          </string-name>
          ,
          <article-title>Automatic detection of machine-generated text using pre-trained language models</article-title>
          ,
          <source>in: Proceedings of the 21st Annual Workshop of the Australasian Language Technology Association</source>
          ,
          <year>2023</year>
          , pp.
          <fpage>159</fpage>
          -
          <lpage>163</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Q.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Yin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. F.</given-names>
            <surname>Karlsson</surname>
          </string-name>
          ,
          <string-name><given-names>C.-Y.</given-names> <surname>Lin</surname></string-name>,
          <article-title>Multi-level knowledge distillation for out-of-distribution detection in text</article-title>
          ,
          <source>arXiv preprint arXiv:2211.11300</source>
          (
          <year>2022</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Junhui</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Mengyan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Erhong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Jingran</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Yujie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Yan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liner</surname>
          </string-name>
          ,
          <article-title>A comparative study of language between artificial intelligence and human: A case study of ChatGPT</article-title>
          ,
          <source>in: Proceedings of the 22nd Chinese National Conference on Computational Linguistics</source>
          ,
          <year>2023</year>
          , pp.
          <fpage>523</fpage>
          -
          <lpage>534</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>