<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Univ. of Hildesheim at AuTexTification 2023: Detection of Automatically Generated Texts</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Tatjana Scheibe</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Thomas Mandl</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Information Science, University of Hildesheim</institution>
          ,
          <addr-line>Universitätsplatz 1, 31141 Hildesheim</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2023</year>
      </pub-date>
      <volume>000</volume>
      <fpage>0</fpage>
      <lpage>0002</lpage>
      <abstract>
        <p>Text generation models can pose a challenge for the legitimacy and authenticity of texts. Large pre-trained models have already reached a high level of quality. This paper presents experiments on classifying whether a text was written by a human or generated by a language model. The experiments were carried out within the shared task AuTexTification: Automated Text Identification 2023. Our approach is based on the pre-trained DeBERTaV2 model. Our run reached a Macro-F1 score of 67.2 and was ranked 15th out of 76 submissions for subtask 1. The paper also presents an analysis of both text classes based on text metrics. The observation of various readability metrics shows that the generated texts tend to show less diversity than human texts.</p>
      </abstract>
      <kwd-group>
        <kwd>Text classification</kwd>
        <kwd>Transformer</kwd>
        <kwd>Chat-GPT</kwd>
        <kwd>Readability</kwd>
        <kwd>Evaluation</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Text generation tools have become extremely powerful, and there is a great need for the
identification of generated text. Therefore, classifiers that distinguish between text
written by humans and text generated by machines need to be developed and
evaluated. The shared task AuTexTification provides a testbed for such research [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ]. In an
experiment with this task, we developed a classifier based on a large pre-trained model in
order to analyze the capabilities of current large and generative language models. This paper
also analyzes the training dataset by quantifying the quality of the text based on
readability metrics [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] as well as other lexical metrics. A transparent analysis could be useful
for the explainability of text classifiers and for supporting the task of detecting unethical use of
language generation.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. State of the Art</title>
      <p>
        Large language models in NLP like BERT [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] and GPT-3 [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] have reached an elevated level
of quality in text analysis and text generation [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Algorithms succeed in writing not only
sentences but whole articles, and can even modify the writing style. The best-performing
systems are currently based on transformers, which process a sentence as a sequence of words
and can consider the context between all words simultaneously [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. Systems like BERT and
GPT-3 extend this basic idea with more complex techniques. BERT is trained to reconstruct
masked tokens within a sentence [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. It can be applied to generate a sentence embedding which
can be used for next sentence prediction.
      </p>
      <p>
        These powerful tools will have consequences for several domains including literature [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ],
scientific writing [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] and many other professional activities [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. Online tools (e.g., https://quillbot.com/,
https://transformer.huggingface.co, https://philosopherai.com) enable citizens
to work with and experiment with AI technology, but also illustrate the limits of even the most
up-to-date systems. These systems succeed in producing grammatically well-formed texts,
but display weaknesses when it comes to coherence [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ].
      </p>
      <p>
        Writing is a fundamental component of academic success in educational contexts. Writing
is central to our social identities and we are often evaluated by our control of it. The release
of ChatGPT (https://openai.com/blog/chatgpt) marks a turn in writing processes in its diverse
forms, as AI now influences writing to a considerably higher extent than previous
technologies. This will dramatically change our cultural practice(s) of writing [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. Consequently, the
understanding of what it means to write, revise and post-edit is challenged. The use of AI
writing to augment human writing skills will have procedural, ethical and pedagogical
ramifications that are currently being debated in the media and in various contexts [13]. Particularly
within higher education, concerns have been raised about the potential impact of AI-based
writing on academic integrity, authorship recognition and critical source analysis. There is also
rising concern that AI writing tools could cause societal issues [14], e.g. due to the spread of
misinformation [15].
      </p>
      <p>Although there are great opportunities for a proper didactic use of language models [16, 17],
there are worries about inappropriate use in academia [18].</p>
      <p>The identification of authorship has become an important topic. Can text classification technology
or humans reliably detect machine-generated content? Since tools have become very
powerful and widely used, the identification of computer-generated text is increasingly
relevant. Researchers have observed an increase in automatically generated content even in
scientific venues [19].</p>
      <p>Text classification experiments for distinguishing between human- and machine-generated
content have been promoted for several years [20]. For example, within the Bot Profiling task in 2019, an F1
score of 0.96 was obtained [21]. Similar values above 0.9 were obtained for several architectures
on another collection; however, the authors admit that the systems are vulnerable to adversarial
attacks [22]. It has also been pointed out that the level of performance drops below 0.9 when
the texts are shorter than 64 characters [23]. Some first collections exist. Some are specific to
domains like misinformation [22] and scientific publications [24].</p>
      <p>OpenAI itself has also published a classifier intended to identify generated text; however, the
company admits that it does not work perfectly well [25]. In an experiment with humans,
automatically generated reviews were perceived to be as fluent as human-written ones [26].
The cues which humans and computers rely on might be quite different [27]. There are suggestions on
how to design tests with humans because the methodological setup can influence the outcome
[28]. For example, in one study, humans were asked to detect the boundary between human
and generated text within a document and performed badly [29].</p>
      <p>
        Despite the research available, more studies are necessary to analyse the differences between
human-written and machine-generated text, if there are any. For example, it is claimed that
automatically generated text uses common phrases more often [30]. The shared task AuTexTification
[
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] contributes to finding the best technologies which can classify successfully. Furthermore,
for assessing the features of texts and the quality of language generation, there is a need for
further metrics [31].
      </p>
      <p>In our study, we dedicate some effort to obtaining text metrics for both classes in the given
dataset. Such an analysis could reveal differences between the two text classes.</p>
    </sec>
    <sec id="sec-3">
      <title>3. System Overview and Experiment</title>
      <p>Within the shared task AuTexTification, we submitted one run (team Stiftungsuni_Hildesheim)
for subtask 1 in English. We intended to solve the task with a pre-trained model in order to
judge its quality for such a classification task. For our experiment, we applied the DeBERTaV2
model. For fine-tuning with the training set, we used the AutoTrain service by
Hugging Face to load the models and to run training and evaluation. The best performance
was achieved by the DeBERTaV2 model for text classification.</p>
      <p>The system was fine-tuned with 3000 randomly chosen texts from the AuTexTification training
dataset. We used this reduced set in order to keep the computational load low.</p>
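      <p>A reduced training set of this kind can be drawn reproducibly with a fixed random seed. The following is a minimal sketch in Python; the function name sample_training_texts, the seed value and the layout of the data as (text, label) pairs are assumptions made for illustration and do not describe the exact procedure used for the submitted run.</p>
      <preformat>
```python
import random


def sample_training_texts(examples, k=3000, seed=42):
    """Return k examples drawn at random without replacement.

    `examples` is assumed to be a list of (text, label) pairs; the seed
    value is arbitrary and only makes the draw repeatable.
    """
    if k > len(examples):
        raise ValueError("cannot sample more examples than are available")
    rng = random.Random(seed)  # local generator, global state stays untouched
    return rng.sample(examples, k)


if __name__ == "__main__":
    # Toy stand-in for the training set: 10,000 dummy (text, label) pairs.
    toy = [(f"text {i}", i % 2) for i in range(10000)]
    subset = sample_training_texts(toy)
    print(len(subset))
```
      </preformat>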
      <p>On the training set, we obtained the following performance values:
• Accuracy: 0.936
• Precision: 0.922
• Recall: 0.952
• AUC: 0.982
• F1: 0.937</p>
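      <p>As a plausibility check, the reported F1 value can be recomputed from the reported precision and recall, since F1 is their harmonic mean. The sketch below is illustrative only; the function name f1_from_precision_recall is an assumption.</p>
      <preformat>
```python
def f1_from_precision_recall(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Precision and recall as reported for the training set above.
    f1 = f1_from_precision_recall(0.922, 0.952)
    print(round(f1, 3))  # 0.937
```
      </preformat>
      <p>The recomputed value rounds to 0.937, which agrees with the reported F1.</p>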
      <p>
        In the result ranking on the test data, the approach reached a Macro-F1 score of 67.2. This
drop suggests that the training adapted the system too strongly to the features of the training set. It is
likely that our model did not cope well with the cross-domain generalization which was the
objective of the task [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
    </sec>
    <sec id="sec-4">
      <title>4. Analysis and Discussion</title>
      <p>This section reports on our analysis of the text features of the training set. We included several
text metrics like readability metrics [32].</p>
      <p>The length of the texts in the training dataset ranges from 1 to 115 words. The number of texts
and their distribution over the classes are shown in Table 1.</p>
      <p>The probability distribution of words follows Zipf’s Law in large corpora. There is no deviation
in the collection of generated texts which could indicate that the generative model does not
create language like humans. Figure 1 shows that the generated texts follow a Zipf
distribution very closely. Previous work confirmed this finding [33].</p>
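      <p>How closely a collection follows Zipf’s Law can be inspected by ranking words by frequency and fitting a line to log-frequency against log-rank; a slope near -1 indicates a Zipf-like distribution. The following sketch uses only the Python standard library. Whitespace tokenization, lowercasing and the function names are simplifying assumptions and not necessarily the procedure behind Figure 1.</p>
      <preformat>
```python
import math
from collections import Counter


def rank_frequencies(texts):
    """Return word frequencies sorted from most to least frequent."""
    counts = Counter()
    for text in texts:
        counts.update(text.lower().split())  # naive whitespace tokenization
    return sorted(counts.values(), reverse=True)


def zipf_slope(frequencies):
    """Least-squares slope of log(frequency) against log(rank)."""
    if len(frequencies) in (0, 1):
        raise ValueError("need at least two distinct words")
    xs = [math.log(rank) for rank in range(1, len(frequencies) + 1)]
    ys = [math.log(freq) for freq in frequencies]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


if __name__ == "__main__":
    toy = ["the cat saw the dog", "the dog ran"]
    freqs = rank_frequencies(toy)
    print(freqs, zipf_slope(freqs))
```
      </preformat>
      <p>On a real corpus, the slope would be estimated separately for the human and the generated class and the two values compared.</p>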
      <p>Lexical diversity has also been considered a key indicator of text quality [34]. The term is often
used synonymously with lexical richness. It is also used to assess human writing
[35]. For evaluating text complexity, lexical density is often used, which is
quantified by measures such as the Type-Token Ratio (TTR) [36].</p>
      <p>We measured the ratio between types and tokens. It can be observed that the generated texts
cover a wider range whereas the human texts exhibit much higher values in the distribution
around the median value. This is illustrated by the boxplot for the distributions in Figure 2 as
well as the histogram in Figure 3.</p>
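      <p>The ratio between types and tokens can be computed per text as the number of distinct words divided by the total number of words. The following is a minimal sketch, assuming lowercased whitespace tokenization; the tokenization behind the figures is not specified here, and the function name is illustrative.</p>
      <preformat>
```python
def type_token_ratio(text):
    """Number of distinct tokens divided by the total number of tokens."""
    tokens = text.lower().split()  # naive whitespace tokenization (assumption)
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


if __name__ == "__main__":
    print(type_token_ratio("the cat saw the dog"))  # 0.8
```
      </preformat>
      <p>Because the ratio tends to fall as texts get longer, comparisons are most meaningful between texts of similar length.</p>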
      <p>The same is the case for further metrics. We show the lexical diversity in Figure 4 and in
Figure 5. Figure 7 shows the lexical richness in both classes and Figure 6 shows the lexical
density. A boxplot of these metrics is given in Figure 8.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Readability Metrics</title>
      <p>Lexical metrics do not provide a full complexity analysis of the text. Readability metrics, like
Flesch Reading Ease [37] or the Gunning Fog Index [38], are well known in the US education system, and some
have been used for nearly a century to assess the difficulty of texts in schools. For example, the
Flesch Reading Ease measures the complexity of a text and typically returns values between 1
and 100. A score of 100 is considered very easy, while 1 is considered very difficult. It was developed by
Rudolf Flesch in the 1940s. These metrics focus on the length of sentences and the length of words.
Recommendations for achieving good scores can be found online (e.g. at http://readable.com).</p>
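      <p>The Flesch Reading Ease score is computed as 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words). The sketch below illustrates the formula in Python. The syllable counter is a rough vowel-group heuristic and the sentence and word splitting are simplistic; both are assumptions for illustration, so the values will differ from those of dedicated readability libraries and are not the implementation behind the figures in this paper.</p>
      <preformat>
```python
import re


def count_syllables(word):
    """Rough syllable estimate: number of vowel groups, at least one."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_reading_ease(text):
    """Flesch Reading Ease of an English text (higher means easier)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (206.835
            - 1.015 * (len(words) / len(sentences))
            - 84.6 * (syllables / len(words)))


if __name__ == "__main__":
    print(round(flesch_reading_ease("The cat sat on the mat."), 3))
```
      </preformat>
      <p>The raw formula is not strictly bounded, so very short and simple texts can score above 100.</p>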
      <p>Figure 9 and Figure 10 show that the generated text in the AuTexTification task does not
exhibit different values for the readability metrics. The models generate a range of sentences
with varying complexity. However, for some metrics, there seems to be a higher number of
texts with a score close to the average. These distributions show a higher peak for values close
to the median when compared to the human-written texts.</p>
      <p>Figure 9: Comparison of several readability metrics for the two classes in the training set
Figure 10: Histogram for several readability metrics for the two classes in the training set</p>
    </sec>
    <sec id="sec-6">
      <title>6. Future Work</title>
      <p>As future work, we plan to improve our classification system. In addition, further text
metrics should be explored. Furthermore, we intend to conduct experiments with humans
[28] in order to find out how well humans perform for the domain and the texts selected for
AuTexTification.</p>
      <p>
[13] L. Li, Z. Ma, L. Fan, S. Lee, H. Yu, L. Hemphill, ChatGPT in education: A discourse analysis
of worries and concerns on social media, arXiv preprint arXiv:2305.02201 (2023).
[14] L. De Angelis, F. Baglivo, G. Arzilli, G. P. Privitera, P. Ferragina, A. E. Tozzi, C. Rizzo,
ChatGPT and the rise of large language models: the new AI-driven infodemic threat
in public health, Frontiers in Public Health 11 (2023) 1567. doi:10.3389/fpubh.2023.1166120.
[15] T. Hsu, S. A. Thompson, Disinformation Researchers Raise Alarms About A.I. Chatbots,
2023. URL: https://www.nytimes.com/2023/02/08/technology/ai-chatbots-disinformation.html.
[16] U. Bohle-Jurok, J. Baumgart, T. Mandl, Ki-basiertes Textfeedback in englischsprachigen
Lehrveranstaltungen (KI-TextengL), in: TextFeedBack in Praxis und Forschung: 3.
gemeinsame Tagung der gefsus, der GeWissS und des Forum Schreiben. 7. - 9. Sept., online.,
2023.
[17] J. M. Gayed, M. K. J. Carlon, A. M. Oriola, J. S. Cross, Exploring an AI-based writing
Assistant’s impact on English language learners, Computers and Education: Artificial
Intelligence 3 (2022) 100055. doi:10.1016/j.caeai.2022.100055.
[18] M. Liebrenz, R. Schleifer, A. Buadze, D. Bhugra, A. Smith, Generating scholarly content
with ChatGPT: ethical challenges for medical publishing, The Lancet Digital Health 5
(2023) e105–e106. doi:10.1016/S2589-7500(23)00019-5.
[19] B. A. Sabel, E. Knaack, G. Gigerenzer, M. Bilc, Fake publications in biomedical science:
Red-flagging method indicates mass production, medRxiv (2023). doi:10.1101/2023.05.06.23289563.
[20] M. S. Aljabri, R. Zagrouba, A. Shaahid, F. Alnasser, A. Saleh, D. M. Alomari, Machine
learning-based social media bot detection: a comprehensive literature review, Soc. Netw.
Anal. Min. 13 (2023) 20. URL: https://doi.org/10.1007/s13278-022-01020-5. doi:10.1007/s13278-022-01020-5.
[21] F. M. R. Pardo, P. Rosso, Overview of the 7th Author Profiling Task at PAN 2019: Bots
and gender profiling in twitter, in: Working Notes of CLEF 2019 - Conference and Labs of
the Evaluation Forum, Lugano, Switzerland, Sept. 9-12, volume 2380 of CEUR Workshop
Proceedings, CEUR-WS.org, 2019. URL: https://ceur-ws.org/Vol-2380/paper_263.pdf.
[22] H. Stif, F. Johansson, Detecting computer-generated disinformation, International Journal
of Data Science and Analytics 13 (2022) 363–383. doi:10.1007/s41060-021-00299-5.
[23] A. Pagnoni, M. Graciarena, Y. Tsvetkov, Threat scenarios and best practices to detect
neural fake news, in: Proc. 29th Intl. Conference on Computational Linguistics, 2022, pp.
1233–1249. URL: https://aclanthology.org/2022.coling-1.106/.
[24] V. Liyanage, D. Buscaldi, A. Nazarenko, A benchmark corpus for the detection of
automatically generated text in academic publications, arXiv preprint arXiv:2202.02013
(2022).
[25] J. H. Kirchner, L. Ahmad, S. Aaronson, J. Leike, New AI classifier for indicating AI-written
text, 2023. URL: https://openai.com/blog/new-ai-classifier-for-indicating-ai-written-text.
[26] D. I. Adelani, H. Mai, F. Fang, H. H. Nguyen, J. Yamagishi, I. Echizen,
Generating sentiment-preserving fake online reviews using neural language models and their
human- and machine-based detection, in: Proc. 34th Intl. Conference on Advanced
Information Networking and Applications, AINA, Caserta, Italy, 15-17 April, volume
1151 of Advances in Intelligent Systems and Computing, Springer, 2020, pp. 1341–1354.
doi:10.1007/978-3-030-44041-1_114.
[27] D. Ippolito, D. Duckworth, C. Callison-Burch, D. Eck, Automatic detection of generated
text is easiest when humans are fooled, in: Proc. 58th Annual Meeting of the Association
for Computational Linguistics, ACL, Online, July 5-10, 2020, pp. 1808–1822. doi:10.18653/v1/2020.acl-main.164.
[28] C. van der Lee, A. Gatt, E. van Miltenburg, S. Wubben, E. Krahmer, Best practices for
the human evaluation of automatically generated text, in: Proc. 12th Intl. Conference on
Natural Language Generation, Association for Computational Linguistics, Tokyo, Japan,
2019, pp. 355–368. doi:10.18653/v1/W19-8643.
[29] L. Dugan, D. Ippolito, A. Kirubarajan, C. Callison-Burch, RoFT: A tool for evaluating
human detection of machine-generated text, in: Proc. Conference on Empirical Methods
in Natural Language Processing: System Demonstrations, ACL, Online, 2020, pp. 189–196.
doi:10.18653/v1/2020.emnlp-demos.25.
[30] S. Gehrmann, H. Strobelt, A. Rush, GLTR: Statistical detection and visualization of
generated text, in: Proceedings of the 57th Annual Meeting of the Association for
Computational Linguistics: System Demonstrations, ACL, Florence, Italy, 2019, pp. 111–116.
doi:10.18653/v1/P19-3019.
[31] J. Novikova, O. Dušek, A. Cercas Curry, V. Rieser, Why we need new evaluation metrics
for NLG, in: Proceedings of the Conference on Empirical Methods in Natural Language
Processing, Association for Computational Linguistics, Copenhagen, Denmark, 2017, pp.
2241–2252. doi:10.18653/v1/D17-1238.
[32] M. Martinc, S. Pollak, M. Robnik-Šikonja, Supervised and unsupervised neural approaches
to text readability, Computational Linguistics 47 (2021) 141–179. doi:10.1162/coli_a_00398.
[33] S. Chugh, R. Rohilla, Empirical laws of natural language processing for neural language
generated text, in: Information, Communication and Computing Technology: 6th
International Conference, ICICCT, New Delhi, India, May 8, Revised Selected Papers 6, Springer,
2021, pp. 184–197. doi:10.1007/978-3-030-88378-2_15.
[34] Y. Wang, J. Deng, A. Sun, X. Meng, Perplexity from PLM Is Unreliable for Evaluating Text
Quality, arXiv preprint arXiv:2210.05892 (2022).
[35] J. Read, Assessing Vocabulary, Cambridge University Press, 2000.
[36] N. Kapusta, M. Müller, M. Schauf, I. Siem, S. Dipper, Assessing the Linguistic Complexity of
German Abitur Texts from 1963–2013, in: Proceedings of the 18th Conference on Natural
Language Processing (KONVENS), 2022, pp. 48–62. URL: https://aclanthology.org/2022.konvens-1.7.pdf.
[37] R. Flesch, How to write plain English, 1979. URL: https://web.archive.org/web/20160712094308/http://www.mang.canterbury.ac.nz/writing_guide/writing/flesch.shtml.
[38] O. S. Goh, C. C. Fung, A. Depickere, K. W. Wong, Using Gunning-Fog index to assess
instant messages readability from ECAs, in: Third International Conference on Natural
Computation (ICNC), volume 5, IEEE, 2007, pp. 480–486. doi:10.1109/ICNC.2007.800.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>A. M.</given-names>
            <surname>Sarvazyan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. Á.</given-names>
            <surname>González</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Franco-Salvador</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Rangel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Chulvi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rosso</surname>
          </string-name>
          ,
          <article-title>Overview of AuTexTification at IberLEF 2023: Detection and Attribution of Machine-Generated Text in Multiple Domains</article-title>
          ,
          <source>in: Procesamiento del Lenguaje Natural</source>
          , Jaén, Spain,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Jiménez-Zafra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Rangel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Montes-y-Gómez</surname>
          </string-name>
          ,
          <source>Overview of IberLEF 2023: Natural Language Processing Challenges for Spanish and other Iberian Languages, Procesamiento del Lenguaje Natural</source>
          <volume>71</volume>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>E.</given-names>
            <surname>Pitler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Nenkova</surname>
          </string-name>
          ,
          <article-title>Revisiting readability: A unified framework for predicting text quality</article-title>
          ,
          <source>in: Proceedings of the Conference on Empirical Methods in Natural Language Processing</source>
          ,
          <year>2008</year>
          , pp.
          <fpage>186</fpage>
          -
          <lpage>195</lpage>
          . URL: https://aclanthology.org/D08-1020.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>J.</given-names>
            <surname>Devlin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Toutanova</surname>
          </string-name>
          ,
          <article-title>BERT: pre-training of deep bidirectional transformers for language understanding</article-title>
          ,
          <source>in: Proc. Conference of the North American Chapter of the ACL: Human Language Technologies, NAACL-HLT</source>
          , Minneapolis, MN, USA, June 2-7, ACL,
          <year>2019</year>
          , pp.
          <fpage>4171</fpage>
          -
          <lpage>4186</lpage>
          . doi:10.18653/v1/n19-1423.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>R.</given-names>
            <surname>Dale</surname>
          </string-name>
          ,
          <article-title>GPT-3: What's it good for?</article-title>
          ,
          <source>Natural Language Engineering</source>
          <volume>27</volume>
          (
          <year>2021</year>
          )
          <fpage>113</fpage>
          -
          <lpage>118</lpage>
          . doi:10.1017/S1351324920000601.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>A.</given-names>
            <surname>Chan</surname>
          </string-name>
          ,
          <article-title>GPT-3 and InstructGPT: technological dystopianism, utopianism, and "contextual" perspectives in AI ethics and industry</article-title>
          ,
          <source>AI Ethics</source>
          <volume>3</volume>
          (
          <year>2023</year>
          )
          <fpage>53</fpage>
          -
          <lpage>64</lpage>
          . doi:10.1007/s43681-022-00148-6.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>S.</given-names>
            <surname>Modha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Majumder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mandl</surname>
          </string-name>
          ,
          <article-title>An empirical evaluation of text representation schemes to filter the social media stream</article-title>
          ,
          <source>J. Exp. Theor. Artif. Intell</source>
          .
          <volume>34</volume>
          (
          <year>2022</year>
          )
          <fpage>499</fpage>
          -
          <lpage>525</lpage>
          . URL: https://doi.org/10.1080/0952813x.2021.1907792.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>A.</given-names>
            <surname>Elstermann</surname>
          </string-name>
          ,
          <article-title>Computer-generated text as a Posthuman mode of literature production</article-title>
          ,
          <source>Open Library of Humanities</source>
          <volume>6</volume>
          (
          <year>2020</year>
          ). doi:10.16995/olh.627.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>M.</given-names>
            <surname>Salvagno</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Taccone</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gerli</surname>
          </string-name>
          ,
          <article-title>Can Artificial Intelligence help for Scientific Writing?</article-title>
          ,
          <source>Crit Care</source>
          <volume>27</volume>
          ,
          <issue>75</issue>
          (
          <year>2023</year>
          ).
          doi:10.1186/s13054-023-04380-2.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>E.</given-names>
            <surname>Felten</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Raj</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Seamans</surname>
          </string-name>
          ,
          <article-title>How will language modelers like ChatGPT affect occupations and industries?</article-title>
          ,
          <source>arXiv preprint arXiv:2303.01157</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>O.</given-names>
            <surname>Marchenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Radyvonenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Ignatova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Titarchuk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Zhelezniakov</surname>
          </string-name>
          ,
          <article-title>Improving text generation through introducing coherence metrics</article-title>
          ,
          <source>Cybernetics and Systems Analysis</source>
          <volume>56</volume>
          (
          <year>2020</year>
          )
          <fpage>13</fpage>
          -
          <lpage>21</lpage>
          . doi:10.1007/s10559-020-00216-x.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>P. P.</given-names>
            <surname>Ray</surname>
          </string-name>
          ,
          <article-title>ChatGPT: A comprehensive review on background, applications, key challenges, bias, ethics, limitations and future scope</article-title>
          ,
          <source>Internet of Things and Cyber-Physical Systems</source>
          (
          <year>2023</year>
          ). doi:10.1016/j.iotcps.2023.04.003.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>