<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>AI Authorship Verification: An Ensembled Approach</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Benjamin Ostrower</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jacob Wessell</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Abhinav Bindal</string-name>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Georgia Institute of Technology</institution>
          ,
          <addr-line>225 North Avenue, Atlanta, 30332</addr-line>
          ,
          <country country="US">United States</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2024</year>
      </pub-date>
      <abstract>
        <p>Our method for detecting AI-written text is an ensemble of three techniques: a fine-tuned RoBERTa model, a neural dependency graph model, and a factual coherence graph model. Each model outputs its prediction logits, which are then concatenated and fed into an XGBoost classifier. With advances in generative technologies, it is becoming increasingly difficult to distinguish between AI-authored and human-authored texts. This has broad societal and ethical impacts and necessitates better techniques for determining text authorship. To tackle this challenge, the PAN 2024 lab [1], hosted at CLEF 2024, posed a task on generative AI authorship verification [2]. Software submissions for this task were made as easy-to-reproduce Docker containers on the TIRA experimentation platform [3]. AI text detection is a field that has grown considerably in recent years, coinciding with the explosion in text generation. Other approaches have centered on teasing out patterns in the probability curvature of token outputs [4], training classifier heads on top of pretrained LLMs [5], and testing the likelihood of a word relative to a ranked list of likely LLM outputs [6]. To train our models, we obtained the following public datasets from Hugging Face: GPT-wiki-intro [7], HF_NabeelShar, HF_artem, and HF_dmitva.</p>
      </abstract>
      <kwd-group>
        <kwd>Dependency Graph</kwd>
        <kwd>AI Text Detection</kwd>
        <kwd>GCN</kwd>
        <kwd>RoBERTa</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
    </sec>
    <sec id="sec-1-1">
      <title>2. Related Work</title>
    </sec>
    <sec id="sec-2">
      <title>3. Methodology</title>
      <sec id="sec-2-1">
        <title>3.1. Dataset</title>
        <p>These datasets were concatenated to create a final dataset of 2.7M entries, which was
partitioned into training, validation, and test sets using an 80%/10%/10% split ratio.</p>
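The split described above can be sketched as follows. This is a minimal illustration of the 80%/10%/10% partition; the function name and fixed seed are our own and not part of the original pipeline.

```python
# Sketch of an 80%/10%/10% train/validation/test partition.
# Records are shuffled once with a fixed seed, then sliced by ratio.
import random

def split_dataset(records, seed=42):
    """Partition records into 80% train, 10% validation, 10% test."""
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_train = int(0.8 * n)
    n_val = int(0.1 * n)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test

train, val, test = split_dataset(list(range(1000)))
```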
      </sec>
      <sec id="sec-2-2">
        <title>3.2. Initial Analysis</title>
        <p>We performed the following analysis on the dataset.</p>
        <p>• Word count distribution for texts with ≤ 500 words
• Number of unique words used in the text (total length ≤ 500 words)
• Number of non-stop words used in the text (total length ≤ 500 words)</p>
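The three statistics above can be computed with a short helper. This is a sketch only: the stop-word set here is a tiny illustrative subset, whereas the actual analysis would use a full list (e.g. NLTK's).

```python
# Sketch of the three per-text statistics, computed on texts capped
# at 500 words. STOP_WORDS is a tiny illustrative subset of a real list.
STOP_WORDS = {"the", "a", "an", "is", "of", "and", "to", "in"}

def text_stats(text):
    words = text.lower().split()[:500]      # cap at 500 words
    word_count = len(words)                 # word count
    unique_words = len(set(words))          # number of unique words
    non_stop = sum(1 for w in words if w not in STOP_WORDS)  # non-stop words
    return word_count, unique_words, non_stop

stats = text_stats("The cat sat on the mat and the dog sat too")
```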
        <p>These investigations revealed that statistically, the texts generated by human and AI were very similar
to each other, requiring us to employ more complex strategies.</p>
      </sec>
      <sec id="sec-2-3">
        <title>3.3. Baseline Model</title>
        <p>For training purposes, we randomly sampled 10,000 human and 10,000 AI texts from the dataset,
then made an 80/20 split between the training and the test set.</p>
        <p>1. We used TfidfVectorizer from the sklearn.feature_extraction.text module with
max_features = 5000 and English stop words to vectorize the training and test sets.</p>
        <p>TF-IDF(t, d) = TF(t, d) × log(N / DF(t))

Where TF(t, d) is the frequency of term t in document d, N is the total number of documents, and
DF(t) is the number of documents containing term t.
2. We used PCA decomposition from the sklearn.decomposition module with 100 components to
reduce the TF-IDF vectors.
3. Finally, we applied XGBClassifier from the xgboost library to train our baseline model to perform
classification. We evaluated the fit of the model based on the accuracy metric.</p>
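The TF-IDF formula above can be illustrated in pure Python. Note that sklearn's TfidfVectorizer applies additional smoothing and normalization, so its exact values differ; this sketch follows the formula as written.

```python
# Pure-Python sketch of TF-IDF(t, d) = TF(t, d) * log(N / DF(t)).
# Documents are represented as lists of tokens.
import math

def tf_idf(term, doc, docs):
    tf = doc.count(term)                     # frequency of term in document
    df = sum(1 for d in docs if term in d)   # documents containing term
    n = len(docs)                            # total number of documents
    return tf * math.log(n / df) if df else 0.0

docs = [["ai", "text", "text"], ["human", "text"], ["ai", "detection"]]
score = tf_idf("text", docs[0], docs)        # TF = 2, DF = 2, N = 3
```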
        <p>This baseline model achieved 87% accuracy.</p>
        <p>Although the model achieved 87% accuracy on this reduced dataset, its performance on a
completely unseen dataset proved unreliable, which we attribute to overfitting on the test
dataset. We therefore developed additional models for more robust prediction of authorship.</p>
      </sec>
      <sec id="sec-2-4">
        <title>3.4. Ensemble Method</title>
        <p>Our method consisted of 3 separate models ensembled together: a
fine-tuned RoBERTa, a graph neural network trained on the dependency graph representation of input
documents, and another graph neural network trained on the factual coherence graph of each sentence.
After pre-training each classifier, the models were ensembled together and an XGBoost model was used
as the prediction head.</p>
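The stacking step can be sketched as follows: each base model's prediction logits are concatenated into one feature row per document, which the XGBoost prediction head then consumes. The logit values below are placeholders, and xgboost itself is omitted.

```python
# Sketch of the ensemble's feature construction: per-model logits are
# concatenated into one feature vector per document for the meta-classifier.
def stack_logits(roberta_logits, dep_graph_logits, coherence_logits):
    """Concatenate per-model logits into one feature row per document."""
    rows = []
    for r, d, c in zip(roberta_logits, dep_graph_logits, coherence_logits):
        rows.append(list(r) + list(d) + list(c))
    return rows

# One document, two-class logits from each of the three models.
features = stack_logits([(0.2, 0.8)], [(1.5, -0.3)], [(0.9, 0.1)])
```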
      </sec>
      <sec id="sec-2-5">
        <title>3.5. Fine-Tuned RoBERTa Model</title>
        <p>In this approach, the text was first tokenized using the RobertaTokenizerFast class from the
transformers library, then classified using the RobertaForSequenceClassification class
initialized with the weights of the pre-trained RoBERTa model. For this phase of the training, the texts
were fed into the model one at a time.</p>
        <p>The weights of the base RoBERTa model were fine-tuned using the Low-Rank Adaptation (LoRA)
technique [8]. This technique reduced the number of trainable parameters to about 1% of the total,
greatly reducing the computational resources required to train this model.</p>
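The parameter saving can be estimated with a back-of-the-envelope calculation: LoRA freezes a d_out × d_in weight matrix and trains only the low-rank factors B (d_out × r) and A (r × d_in), so the trainable count drops from d_out · d_in to r · (d_out + d_in). The dimensions below are RoBERTa-like (hidden size 768) but purely illustrative.

```python
# Sketch of LoRA's trainable-parameter fraction for one weight matrix:
# full fine-tuning trains d_out * d_in parameters, LoRA trains
# r * (d_out + d_in) parameters for the factors B and A.
def lora_trainable_fraction(d_out, d_in, r):
    full = d_out * d_in
    lora = r * (d_out + d_in)
    return lora / full

# A 768x768 attention projection with rank r = 2.
frac = lora_trainable_fraction(768, 768, 2)
```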
        <p>Tuning was performed on the following hyperparameters of the model:
• LoRA rank (r): {1, 2, 10, 20}
• LoRA alpha (α): {0.5, 1, 5, 10}
• Learning rate: {1e-2, 1e-3, 1e-4}</p>
        <p>The model was trained for 10 epochs. We achieved an accuracy of 95.66% on the validation dataset.</p>
        <sec id="sec-2-5-1">
          <title>3.5.1. Dependency Parsing</title>
          <p>In natural language processing, the dependency parse of a sentence aims to visually represent the
syntactic and grammatical structure of a sentence by mapping the dependencies between words. Such
dependencies include categories such as direct objects or the subject of a verb. A directed graph naturally
represents this structure, and we felt that converting our documents into a collection of such graphs
would make structural features about the document readily available to the neural network.</p>
          <p>The first step in this pipeline was to convert raw text documents into dependency graphs. In this
step, we limited the length of the documents to 500 words and skipped any sentence with fewer than
3 words. The NLP library spaCy provides functionality for generating the dependencies of each
word in a sentence, so we simply looped over each sentence in each document and recorded the
dependencies between the words in the sentence as well as the type of each dependency. These represented
the edges and edge attributes of our graph respectively. For the node features, we used Wikipedia2Vec's
100-dimensional pre-trained embeddings to convert words into meaningful representations. Once these
graphs were created, they could be converted into the format required by PyTorch Geometric
to create our dataset. One small addition we made was a special edge type: root. Root edges connect
the roots of the sentences in a document (the root is typically, although not always, the main verb). This
is because the dependency parse is a sentence-level structure, while we are concerned with classifying
documents: without our special root connections, the message-passing operations employed by
the downstream graph neural network would be unable to propagate information across sentences in the
same document.</p>
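The per-sentence edge construction can be sketched as follows. The parse here is hand-written for illustration; in the pipeline it would come from spaCy (each token's head and dependency label, via token.head and token.dep_).

```python
# Sketch of the dependency-graph construction: each token contributes a
# directed edge from its syntactic head, labelled with the dependency type.
def build_edges(sentence_parse, offset=0):
    """sentence_parse: list of (token_index, head_index, dep_label) tuples."""
    edges, labels = [], []
    for tok, head, dep in sentence_parse:
        if tok != head:                       # skip the root's self-reference
            edges.append((offset + head, offset + tok))
            labels.append(dep)
    return edges, labels

# Hand-written parse of "She reads books": "reads" (index 1) is the root.
parse = [(0, 1, "nsubj"), (1, 1, "ROOT"), (2, 1, "dobj")]
edges, labels = build_edges(parse)
```

The offset parameter lets each sentence's node indices be shifted so that all sentences of a document share one node numbering, which is where the special root edges between sentences would be added.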
          <p>With our dataset in hand, we were quickly able to train a GNN model to distinguish AI- and
human-written documents. We experimented with standard graph convolutional networks as well as the
Graph Attention Network and GIN models provided by PyTorch Geometric. The best iterations of these
models achieved 95% accuracy on a dataset consisting of examples from Wiki Intros and the Nabeel
Shar dataset. However, this performance did not translate well to the data provided for the competition,
which we attribute to our models overfitting to the training data. For example, our model
used Wikipedia2Vec embeddings, which naturally lends itself to performing well on the Wiki
Intros dataset. Additionally, many of the human-written examples in Nabeel Shar are very poorly
written, making distinguishing between the AI- and human-generated texts very easy (when trained on
only Nabeel Shar, the model very quickly achieved nearly 100% accuracy).</p>
        </sec>
        <sec id="sec-2-5-2">
          <title>3.5.2. Factual Coherence</title>
          <p>This implementation was based on the paper by Zhong et al. [9]. The method implements a graph
convolutional network over the named entities in a text, with a classifier head to discriminate between AI
and human text.</p>
          <p>There are 3 main extractions to be made from each document. The texts are first processed using
spaCy and NLTK to tokenize a document into sentences and parse out any named entities. Secondly, each
sentence is passed through a BertForNextSentencePrediction model to obtain next-sentence-prediction (NSP)
scores for each sentence in a document (this helps model the coherence from one sentence to the
next, and serves as a weight for how much credence to place on entities from that sentence). Thirdly,
the entire text (up to 400 tokens) is passed through a RoBERTa model to obtain the classification
token (CLS) as a semantic representation of the text itself. The entities obtained in the first
step are embedded using RoBERTa and concatenated with their corresponding wiki2vec embedding (all
0's if the entity is not found).</p>
          <p>To create the graph of entities, two conditions for creating an edge are followed: first, if the entities
appear together in the same sentence (an intra-sentence edge); second, for an inter-sentence
edge, if the cosine similarity between the entity embeddings is greater than 0.9. Some named entities
have different embedding lengths than others; to remedy this we take the average so that each entity
embedding is a 1×868 vector (768 for the RoBERTa embedding, 100 for the wiki2vec embedding).</p>
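The inter-sentence edge rule can be sketched as follows: connect two entity embeddings whenever their cosine similarity exceeds 0.9. The vectors below are toy 3-dimensional stand-ins for the concatenated entity embeddings.

```python
# Sketch of inter-sentence edge creation: an edge is added between two
# entity nodes when the cosine similarity of their embeddings exceeds 0.9.
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def inter_sentence_edges(entities, threshold=0.9):
    edges = []
    for i in range(len(entities)):
        for j in range(i + 1, len(entities)):
            if cosine(entities[i], entities[j]) > threshold:
                edges.append((i, j))
    return edges

ents = [[1.0, 0.0, 0.0], [0.99, 0.1, 0.0], [0.0, 1.0, 0.0]]
edges = inter_sentence_edges(ents)   # only the first two are near-parallel
```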
          <p>The model first uses a GCN to create graph-enhanced representations of these nodes. The adjacency
matrix is first normalized by the degree matrix and then passed through a variable number of convolutions
to create these new representations:</p>
          <p>H^(l+1) = σ(D^(−1) A H^(l) W^(l))</p>
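One degree-normalized propagation step can be sketched in pure Python. The tiny 2-node graph and identity weight matrix are illustrative only; the actual model operates on the entity graph with learned weights.

```python
# Sketch of one GCN propagation step: add self-loops to the adjacency
# matrix, normalize rows by node degree, aggregate neighbor features,
# then apply a linear transform followed by ReLU.
def gcn_step(adj, feats, weight):
    n = len(adj)
    # add self-loops so each node keeps its own features
    a = [[adj[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a]
    # row-normalize by degree: D^-1 A
    a = [[a[i][j] / deg[i] for j in range(n)] for i in range(n)]
    # aggregate neighbor features: (D^-1 A) H
    agg = [[sum(a[i][k] * feats[k][j] for k in range(n))
            for j in range(len(feats[0]))] for i in range(n)]
    # linear transform, then ReLU: relu(agg W)
    out = [[max(0.0, sum(agg[i][k] * weight[k][j] for k in range(len(weight))))
            for j in range(len(weight[0]))] for i in range(n)]
    return out

# Two connected nodes, 2-d one-hot features, identity weight matrix.
H = gcn_step(adj=[[0, 1], [1, 0]],
             feats=[[1.0, 0.0], [0.0, 1.0]],
             weight=[[1.0, 0.0], [0.0, 1.0]])
```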
          <p>These enhanced entities are then substituted back into the original list in which they were tracked. Each sentence
may have a variable number of entities, so the entities are averaged and passed through several linear and ReLU
layers to create an averaged entity representation for each sentence.</p>
          <p>Each document may have a variable number of sentences, so these averaged sentence entities are
then passed through an LSTM. The LSTM outputs are then combined with the NSP scores in a weighted sum:</p>
          <p>h_t = LSTM(s_t, h_(t−1))</p>
          <p>D = Σ_(t=1..T) NSP_t · h_t</p>
          <p>Where s_t is the averaged entity representation of sentence t, h_t is the corresponding LSTM output, and
NSP_t is the next-sentence-prediction score for sentence t.</p>
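The NSP-weighted aggregation described in the prose (a sum-product of the LSTM outputs with the NSP scores) can be sketched as follows, with toy vectors and scores standing in for the real LSTM outputs.

```python
# Sketch of the NSP-weighted aggregation: the document vector is the sum
# of per-sentence LSTM outputs, each weighted by its NSP score.
def nsp_weighted_sum(lstm_outputs, nsp_scores):
    dim = len(lstm_outputs[0])
    doc = [0.0] * dim
    for h, w in zip(lstm_outputs, nsp_scores):
        for j in range(dim):
            doc[j] += w * h[j]
    return doc

# Two sentences with 2-d LSTM outputs and their NSP scores.
D = nsp_weighted_sum([[1.0, 2.0], [3.0, 4.0]], [0.5, 1.0])
```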
          <p>Finally, this output, D, is concatenated with the CLS token to create the final representation before
being passed to the classifier head. Several different ablations were run, varying the use of the
wikipedia2vec embeddings and the number of convolutions. Optimal results were obtained with 10 convolutions and the
wiki2vec embeddings, achieving a best validation accuracy (on a combined set of equal parts human/AI
text across the aforementioned datasets) of 65%.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>4. Results</title>
      <p>While we did not make a final submission, we did achieve preliminary results on test datasets
provided by the competition. Our ensembled method reached 61% accuracy on a
bootstrapped dataset supplied by the competition, comprising PAN's human-written and Bard-generated texts.</p>
    </sec>
    <sec id="sec-4">
      <title>Acknowledgments</title>
      <p>Thank you to the DS@GT CLEF team for their support. Special thanks to Anthony Miyaguchi for
putting together this team for the competition.</p>
    </sec>
    <sec id="sec-5">
      <title>References</title>
      <p>[3] M. Fröbe, M. Wiegmann, N. Kolyada, B. Grahm, T. Elstner, F. Loebe, M. Hagen, B. Stein, M. Potthast,
Continuous Integration for Reproducible Shared Tasks with TIRA.io, in: J. Kamps, L. Goeuriot,
F. Crestani, M. Maistro, H. Joho, B. Davis, C. Gurrin, U. Kruschwitz, A. Caputo (Eds.), Advances
in Information Retrieval. 45th European Conference on IR Research (ECIR 2023), Lecture Notes
in Computer Science, Springer, Berlin Heidelberg New York, 2023, pp. 236–241. doi:10.1007/978-3-031-28241-6_20.</p>
      <p>[4] E. Mitchell, Y. Lee, A. Khazatsky, C. D. Manning, C. Finn, DetectGPT: Zero-shot machine-generated
text detection using probability curvature, in: International Conference on Machine Learning,
PMLR, 2023, pp. 24950–24962.</p>
      <p>[5] I. Solaiman, M. Brundage, J. Clark, A. Askell, A. Herbert-Voss, J. Wu, A. Radford, G. Krueger,
J. W. Kim, S. Kreps, et al., Release strategies and the social impacts of language models, arXiv preprint
arXiv:1908.09203 (2019).</p>
      <p>[6] S. Gehrmann, H. Strobelt, A. M. Rush, GLTR: Statistical detection and visualization of generated text,
arXiv preprint arXiv:1906.04043 (2019).</p>
      <p>[7] A. Bhat, GPT-wiki-intro (revision 0e458f5), 2023. URL: https://huggingface.co/datasets/aadityaubhat/GPT-wiki-intro.
doi:10.57967/hf/0326.</p>
      <p>[8] E. J. Hu, Y. Shen, P. Wallis, Z. Allen-Zhu, Y. Li, S. Wang, L. Wang, W. Chen, LoRA: Low-rank
adaptation of large language models, in: International Conference on Learning Representations,
2022. URL: https://openreview.net/forum?id=nZeVKeeFYf9.</p>
      <p>[9] W. Zhong, D. Tang, Z. Xu, R. Wang, N. Duan, M. Zhou, J. Wang, J. Yin, Neural deepfake detection
with factual structure of text, arXiv preprint arXiv:2010.07475 (2020).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>A. A.</given-names>
            <surname>Ayele</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Babakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Bevendorf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X. B.</given-names>
            <surname>Casals</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Chulvi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Dementieva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Elnagar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Freitag</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Fröbe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Korenčić</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mayerl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Moskovskiy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mukherjee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Panchenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Potthast</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Rangel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Rizwan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rosso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Schneider</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Smirnova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Stamatatos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Stakovskii</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Stein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Taulé</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ustalov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wiegmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Yimam</surname>
          </string-name>
          , E. Zangerle,
          <article-title>Overview of PAN 2024: Multi-Author Writing Style Analysis, Multilingual Text Detoxification, Oppositional Thinking Analysis, and Generative AI Authorship Verification</article-title>
          , in: L.
          <string-name>
            <surname>Goeuriot</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <string-name>
            <surname>Mulhem</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          <string-name>
            <surname>Quénot</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <string-name>
            <surname>Schwab</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          <string-name>
            <surname>Soulier</surname>
          </string-name>
          ,
          <string-name>
            <surname>G. M. D. Nunzio</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <string-name>
            <surname>Galuščáková</surname>
          </string-name>
          ,
          <string-name>
            <surname>A. G. S. de Herrera</surname>
          </string-name>
          , G. Faggioli, N. Ferro (Eds.),
          <source>Experimental IR Meets Multilinguality, Multimodality, and Interaction. Proceedings of the Fifteenth International Conference of the CLEF Association (CLEF</source>
          <year>2024</year>
          ), Lecture Notes in Computer Science, Springer, Berlin Heidelberg New York,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>J.</given-names>
            <surname>Bevendorf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wiegmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Karlgren</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Dürlich</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Gogoulou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Talman</surname>
          </string-name>
          , E. Stamatatos,
          <string-name>
            <given-names>M.</given-names>
            <surname>Potthast</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Stein</surname>
          </string-name>
          ,
          <article-title>Overview of the “Voight-Kampf” Generative AI Authorship Verification Task at PAN</article-title>
          and
          <article-title>ELOQUENT 2024</article-title>
          , in: G. Faggioli,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ferro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Galuščáková</surname>
          </string-name>
          , A. G. S. de Herrera (Eds.),
          <source>Working Notes of CLEF 2024 - Conference and Labs of the Evaluation Forum, CEUR Workshop Proceedings, CEUR-WS.org</source>
          ,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>