<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>David versus Goliath: Can Machine Learning Detect LLM-Generated Text? A Case Study in the Detection of Phishing Emails</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Francesco Greco</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Giuseppe Desolda</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Andrea Esposito</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alessandro Carelli</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Bari Aldo Moro</institution>
          ,
          <addr-line>Via E. Orabona 4, 70125 Bari</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Large Language Models (LLMs) offer numerous benefits, but they also pose threats: cybercriminals can use them to create fake yet convincing content, such as phishing emails. Generating emails with LLMs is more convenient for criminals than handcrafting them, making phishing campaigns more likely and more widespread in the future. To combat these attacks, detecting whether an email is generated by LLMs is critical. However, previous attempts have resulted in solutions that are uninterpretable and resource-intensive due to their complexity. This results in warning dialogs that do not adequately protect users. This work aims to address this problem using traditional, lightweight machine learning models that are easy to interpret and require fewer computational resources. This approach allows users to understand why an email was classified as AI-generated, improving their decision-making in the case of phishing emails. This study has shown that logistic regression can achieve excellent performance in detecting emails generated by LLMs, while still providing the transparency needed to offer useful explanations to users.</p>
      </abstract>
      <kwd-group>
        <kwd>phishing detection</kwd>
        <kwd>explanation</kwd>
        <kwd>warning dialogs</kwd>
        <kwd>machine learning</kwd>
        <kwd>large language models</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        In an era marked by the proliferation of digital communication channels, phishing attacks
are a growing concern for individuals, enterprises, and organizations [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Phishing is a
cyberattack in which malicious users deceive victims to steal sensitive information, such as passwords,
financial details, and personal data. In recent years, this attack has escalated with the introduction
of Large Language Models (LLMs) designed as a ‘black hat’ alternative to traditional GPT
models, allowing hackers to automate phishing and other malicious cyber-attacks, operating
without ethical limits or restrictions. Such LLMs are highly successful since they aid attackers
in generating highly convincing, tailored, and contextually relevant text, making it even more
challenging to distinguish between legitimate content and malicious phishing attempts.
      </p>
      <p>
        State-of-the-art solutions for detecting phishing attacks relied upon rule-based systems,
blacklists, machine learning, and heuristic analysis [
        <xref ref-type="bibr" rid="ref2 ref3">2, 3</xref>
        ]. Although these approaches have been
somewhat effective in detecting phishing content, they struggle to keep up with the constant
updates and evolutions of phishing attacks. More recently, LLMs have also been used to classify
LLM-based attacks [
        <xref ref-type="bibr" rid="ref17 ref4 ref5 ref6">4, 5, 6</xref>
        ]. Despite the plethora of solutions to detect phishing content, this
attack remains very effective [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. This problem is strongly related to the role of the victim in
this attack, which is often neglected in the design of defensive solutions. Indeed, when phishing
defensive systems identify a threat with a probability lower than a certain threshold (e.g., below
95%), they leave the user with a choice of what to do by showing a warning dialog. Even if
the models used to classify the content have high accuracy, non-technological aspects, such as
human factors [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], can lead users to ignore warnings. One such issue is the habituation effect
[
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]: when a user is repeatedly exposed to the same visual stimulus, like a phishing warning,
they may eventually start to ignore its recommendations. Warning messages often contain
technical or general information that may be difficult for users to comprehend. Research has
demonstrated the significance of creating polymorphic warning interfaces in the context of
phishing, which are interfaces that alter their appearance and/or content each time they are
displayed to the user to reduce the habituation effect [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. The second issue pertains to the
absence of clear explanations. The provision of specific explanations within warnings has been
shown to support users in making informed choices and thus reduce the risk of falling victim
to phishing [
        <xref ref-type="bibr" rid="ref10 ref11 ref12">10, 11, 12</xref>
        ]. The third problem is the distance between the different research fields
that study this attack: AI investigates classification models that perform as well as possible by
focusing on metrics such as precision and recall; on the other hand, HCI focuses on the design
of warnings and understanding of human factors, neglecting how phishing detection models
can consider such aspects, for example, how models can provide explanations and how they
can generate polymorphic content.
      </p>
      <p>
        In an attempt to fill part of the gap in the literature, this study investigates machine learning
models capable of detecting human- or LLM-generated phishing emails. Specifically, the models
we investigated in this research are conceived as post-hoc models to be used in conjunction
with already existing phishing detection systems (e.g., Google Safe Browsing), to provide a
more powerful explanation to victims. Indeed, if these post-hoc models can establish if the
email was LLM-generated, users can be warned with a more appropriate explanation, on the
assumption that explanation is crucial in defending users against this attack. The choice of
“traditional” machine learning models over LLMs or novel larger neural networks is twofold:
larger neural networks and LLMs are black-box models, which hampers their explainability, as it can
only be approximated using post-hoc techniques; furthermore, larger models have a significant
requirement of computational resources and have a non-negligible impact on the environment
[
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], making them a worse choice for improving an existing classifier over other green smaller
models.
      </p>
      <p>
        We benchmarked 8 different machine learning (ML) models (i.e., random forest, SVM, XGBoost,
logistic regression, K-nearest neighbors, naïve Bayes, and neural network) selected by reviewing
the literature on LLM-generated text detection [
        <xref ref-type="bibr" rid="ref14 ref15 ref16">14, 15, 16, 17, 18, 19</xref>
        ]. The ML models were
trained on a dataset comprising human-generated phishing emails [20] and LLM-generated
phishing emails, created using WormGPT LLM [21]. Additionally, we trained a neural network
on human- and LLM-generated generic text, and we then applied transfer learning by training
the models on our dataset. To empower the training process with these datasets, we meticulously
examined existing literature to identify pertinent text features utilized for distinguishing
LLM-generated text [22, 23, 24, 25, 26], as well as text generated by artificial intelligence (AI) in general
[27, 28, 29]. A comprehensive set of 30 textual features was defined and used for encoding the
datasets before the training phase.
      </p>
      <p>The highest accuracy was obtained by the neural network without transfer learning (99.78%),
but good performances were obtained by SVM (99.20%) and logistic regression (99.03%). We also
compared the ML models, considering their ability to provide local explanations, i.e., their ability
to provide information on the feature(s) that mainly contributed to the classification of the
phishing email. Naïve Bayes model and logistic regression excel in providing local explanations,
whereas other models such as neural networks, due to their black-box nature, necessitate
supplementation with post-hoc eXplainable Artificial Intelligence (XAI) techniques like LIME
[30] and SHAP [31]. While SHAP and LIME enable the explanation of black-box models, they
inherently offer an approximation of the true rationale behind classification decisions. Thus, a
trade-off between accuracy and quality of the explanation must be considered when choosing
the right model for this task. From our perspective, the optimal compromise lies in adopting
logistic regression.</p>
      <p>The paper is structured as follows. Section 2 reports the background and related work on
phishing detection solutions, LLMs and their use as phishing powering tools, and research in
detecting AI-generated textual content. Section 3 describes the pipeline we used to train and
test different ML models and their comparison. In Section 4, we discuss future works and draw
conclusions.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Background and Related Work</title>
      <sec id="sec-2-1">
        <title>2.1. Phishing Detection</title>
        <p>
          Phishing is a problematic threat, as it leverages human vulnerabilities to succeed [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]. Therefore,
to effectively offer protection against phishing attacks, both technological and human defenses
should be put in place. Automated phishing detection is one of the main techniques to mitigate
the problem of phishing [32, 33], and it comprises all the techniques for automatically detecting
phishing content such as emails or websites. Generally, there are two main approaches to
protect users from phishing attacks with detection techniques: the phishing content can be
filtered to not allow it to reach the user in the first place, or the user can be warned about the
threat.
        </p>
        <p>
          One of the most used techniques for filtering dangerous content is to block phishing websites
according to their presence on blacklists [34]. This approach yields very high precision
in the detection (low false positive rate) since blacklisted websites are almost certainly malicious.
The downside is that it takes time for the blacklists to be updated, and, therefore, a lot of
false negatives can still reach the user in the case of zero-day attacks [34]. On the other hand,
detection methods based on artificial intelligence (AI) are capable of also blocking unseen
attacks, substantially improving the recall in this task [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]. However, AI-based detectors are not
100% accurate [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] and can still produce false positives (i.e., genuine emails/websites classified as
phishing), which can ultimately jeopardize user productivity. Therefore, automatic filtering is
only applicable to methods that have a very low chance of producing false positives, such as
blacklists.
        </p>
        <p>
          To ensure that the user can decide about emails or websites for which the classification is
uncertain, a common approach consists of displaying a warning dialog that alerts the user about
the possible danger [35]. This can be applied, for example, to emails or websites that have been
classified by an AI detector as “phishing” with a certain probability (e.g., in the 70-95% range).
Warnings can persuade the user to steer clear of suspicious content, but commonly employed
warnings are flawed, as they often lack explanations [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]. The lack of explanation about the
specific danger places the burden of locating phishing cues on the user, who is often not an
expert and does not possess the knowledge to make an informed decision [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]. Moreover, the
lack of explanations can demotivate the user from heeding the warning and can lower the trust
in the system [36]. Another problem with traditional warnings is that they retain the same
aspect, even under different circumstances: this can easily produce a habituation effect in the
users, who are much more likely to ignore the warning [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ]. To reduce the habituation effect,
warnings should be polymorphic, i.e., change their aspect (color, shape, content, etc.) with each
interaction.
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Large Language Models and LLM-powered Phishing Tools</title>
        <p>Large Language Models (LLMs) represent one of the biggest technological advances in the
field of Natural Language Processing. Currently, most LLMs are based on the Transformer
architecture [37]; their staggering performance on human-like tasks [38] is mainly due to their
massive number of parameters and the vast amount of data on which they are trained, which
confers them the capability to identify subtle patterns in linguistic data and have access to
extensive knowledge on several domains. Some of the most relevant commercially available
LLMs are OpenAI’s ChatGPT [39] and all the GPT models [40], PaLM 2 [41] and Gemini [42]
by Google, Claude 2 [43] by Anthropic, and Meta’s Llama 2 [44].</p>
        <p>
          Cybercriminals did not waste time finding malicious uses of LLMs. Indeed, AI’s impressive
capabilities in creating human-like text can help fraudsters generate phishing emails that are
more effective at deceiving users; producing convincing messages using LLMs also requires
much less time and effort than crafting emails manually. LLM-generated content appears to
possess critical properties for successful phishing attacks like convincingness, consistency,
personalization, and fluency [45]. In a study by Hazell [46], GPT-3.5 and GPT-4 were used
to produce spear-phishing emails directed to 600 British Members of Parliament, including
collecting publicly available information; results showed that LLMs could considerably facilitate
the execution and scaling of spear-phishing attacks. A study by Sha [47] showed that
GPT-3-generated phishing emails were less effective overall than human-crafted phishing emails.
However, Heiding et al. [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ] demonstrated that GPT-4 can generate the most effective phishing
attacks when humans refine the emails produced by the model. This work shows that phishing
campaigns powered by advanced LLMs like GPT-4 would be extremely advantageous for
criminals, even if conducted in a completely automatic manner.
        </p>
      </sec>
      <sec id="sec-2-3">
        <title>2.3. Detecting LLM-Generated Text</title>
        <p>A first step towards the mitigation of phishing campaigns powered by LLMs is to detect whether
an email is LLM-generated. Various efforts have been made in the literature toward this research
direction, even though detecting AI-generated text without knowing the method used for the
generation remains very tricky. There are various types of detectors for AI-generated text
[48], but the most investigated category includes language models that are fine-tuned for the
task.</p>
        <p>These detectors are binary classifiers trained to discriminate between AI and human-generated
content [49]. In 2019, OpenAI published GPT-2PD [50], a model based on RoBERTa [51] for
detecting content produced by GPT-2 1.5B with an accuracy of ~95%. OpenAI then published
a model for detecting generic AI-written text [52], but shut it down shortly after, as it had a
very low performance in terms of recall (26%); nonetheless, this model was even described as
significantly more reliable than the old GPT-2 detector [53]. Zellers et al. [54] proposed Grover,
a transformer-based model for news generation, which is also used to detect AI-generated text.
Using Grover itself to discriminate texts generated by Grover was indeed the most effective
approach (~90% detection accuracy). GLTR [55] is a detector that uses both BERT and GPT-2
117M for detecting AI-generated text and offering users visual support to assist them in forensic
analysis; the model itself achieved an AUC of about 0.86, and it proved effective in
improving the user’s performance in detecting AI-generated text (from 54% to 72%). Adelani
et al. [56] compared Grover [54], GLTR [55], and GPT-2PD [50] on the detection of product
reviews generated by GPT-2 fine-tuned on Amazon product reviews; the GPT-2 detector was
the best at discriminating text generated by the GPT-2 model.</p>
        <p>Fagni et al. [57] fine-tuned a RoBERTa-based model to detect AI-generated tweets in a dataset
of deepfake tweets, obtaining an F1-score=0.896, outperforming both traditional ML models (e.g.,
bag-of-words) and complex neural network models (e.g., RNN, CNN) by a large margin. Uchendu
et al. [58] employed a RoBERTa-based approach, which outperformed baseline detectors in
spotting news articles generated by several TGMs (F1-score between ~0.85 and ~0.92). Finally,
Mitrovic et al. [59] fine-tuned a DistilBERT model and used it to detect ChatGPT-generated text,
obtaining excellent performance in a standard setting (accuracy=0.98), and decent performance
in an adversarial setting (accuracy=0.79). Moreover, SHAP (Shapley Additive exPlanations) [31]
was used to provide local explanations for specific decisions in the form of highlighting input
text tokens.</p>
        <p>DetectGPT [49] pertains to a different category of detectors, as it is not fine-tuned on any
data for detecting LLM-generated content; in fact, it is a zero-shot detector, which uses different
statistical signatures of AI-generated content to perform the detection. DetectGPT achieved,
on average, ~0.95 AUROC in detecting content that was generated by different LLMs, across
different datasets.</p>
        <p>Watermarking is yet another technique used for detecting LLM-generated text. These
detectors embed imperceptible signals in the generated medium itself so that they can later be
detected efficiently [60]. An example of such detectors was presented by Kirchenbauer et al.
[61].</p>
        <p>All the mentioned detectors have the problem of being vulnerable to paraphrasing attacks,
since even a light paraphraser can severely affect the reliability of the models [62]. Krishna et al.
[63] proposed a retrieval-based detector, which seems to partially mitigate this vulnerability.
This approach searches a database containing sequences of text previously generated by an LLM
to detect LLM-generated content. The proposed algorithm looks for sequences that match the
input text within a certain threshold. The authors empirically tested the tool using a database of
15M generations from a fine-tuned T5-XXL model, finding that it was able to detect 80% to 97%
of paraphrased generations across different settings while only classifying 1% of human-written
sequences as AI-generated.</p>
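        <p>The retrieval idea described above can be illustrated with a minimal sketch. It is not the algorithm of Krishna et al.: the word-level Jaccard similarity, the 0.7 threshold, the function names, and the one-entry database are all assumptions chosen only to show how matching an input against stored generations within a threshold might work.</p>
<preformat>
```python
def jaccard(a: str, b: str) -> float:
    """Word-level Jaccard similarity between two texts (a stand-in metric)."""
    set_a, set_b = set(a.lower().split()), set(b.lower().split())
    if not set_a or not set_b:
        return 0.0
    return len(set_a.intersection(set_b)) / len(set_a.union(set_b))


def is_llm_generated(text: str, generations: list[str], threshold: float = 0.7) -> bool:
    """Flag `text` if any stored generation matches it within the threshold."""
    return any(jaccard(text, g) >= threshold for g in generations)


# Hypothetical database of text previously produced by an LLM.
database = ["your account has been suspended please verify your details"]

paraphrase = "your account has been suspended please confirm your details"
unrelated = "lunch is at noon tomorrow in the main hall"

print(is_llm_generated(paraphrase, database))  # light paraphrase of a stored text
print(is_llm_generated(unrelated, database))   # text sharing no words with the database
```
</preformat>
        <p>A real system would likely replace the word-overlap metric with a semantic or lexical retrieval method able to scale to millions of stored generations; the threshold then controls the balance between detecting paraphrases and misclassifying human-written text.</p>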
        <p>
          Another, more traditional approach applies machine learning techniques to
detect AI-generated text. This involves using linguistic features extracted from the text such
as TF-IDF (Term Frequency – Inverse Document Frequency) and bag-of-words [64] features
(e.g., [57]), but also features like readability and understandability indexes (e.g., [58]). Various
works address the problem with traditional ML models, including Naïve Bayes [65], SVM [19],
Random Forest [17], XGBoost [18], multi-layer perceptron [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ], and K-Nearest Neighbors [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ].
In May 2019, OpenAI released a simple detector based on logistic regression that uses TF-IDF
unigram and bigram features [66] that was able to detect GPT-2-generated content with an
accuracy between 74% and 97% [50].
        </p>
        <p>
          However, the huge number of parameters in LLMs (and other larger neural network-based
techniques) requires vast computational resources for both training and inference. As
they become more and more complex and widespread, their energy consumption and, thus, their
carbon footprint become non-negligible [67]. Green AI [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ] is a new field investigating how
AI can be more environmentally friendly and inclusive. Lightweight models, e.g., traditional
machine learning models such as random forests or shallow neural networks, can, therefore,
be considered “green models” as they are a much more sustainable choice in terms of energy
consumption.
        </p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Detecting Phishing Attacks Generated by LLMs</title>
      <p>As a small step towards a polymorphic explainable model for phishing detection, we focus on
detecting the author (i.e., humans or LLMs) of phishing emails using green AI models. The
following section delves into the machine learning aspects of our work, providing details into the
generation of the dataset, the training procedure, and the final results, providing a comparison
among all tested machine learning models.</p>
      <sec id="sec-3-1">
        <title>3.1. Materials</title>
        <p>An appropriate dataset is needed to train machine learning models to discriminate between
human-generated phishing emails and LLM-generated ones. With this goal in mind, we accessed
a curated collection of human-generated phishing emails [20], selecting the most recent 1000
emails from the “Nazario” and “Nigerian Fraud” collections. To complete the dataset, we
generated 1000 additional emails using an LLM. We adopted WormGPT, a version of ChatGPT
fine-tuned to comply with malicious requests [21]. To generate the emails, the following prompt
was used:</p>
        <p>Pretend to be a hacker planning a phishing campaign. Generate 5 very detailed
phishing emails, about [topic] using Cialdini’s principle of [principle]. You have to
use fake American real names for the sender and recipient (example: John Smith).</p>
        <p>Invent a phishing link URL for each email (example: https://refund-claim-link.com).</p>
        <p>In the prompt, two variables have been introduced to increase the variability of the email
content. The “topic” variable determines the main message of the phishing email. The topics
that were selected and used for the generation are common topics for phishing emails: (i) Urgent
Account Verification, (ii) Financial Transaction, (iii) Prize or Lottery Winning, (iv) Fake Invoice
or Payment Request, (v) Charity Scam, (vi) Account Security, (vii) Tax Refund Scam, (viii) Job
Offer, (ix) Social Media Notification, (x) COVID-19 Related Scam, (xi) Law Breaking Activity,
(xii) War-Related Aid, and (xiii) Other random topics. The “principle” variable, instead, refers to
Cialdini’s six principles of persuasion [68], typically used in phishing emails to persuade users
to perform malicious and dangerous actions. The values used in the prompts for the Cialdini
principles were: (i) Reciprocity, (ii) Consistency, (iii) Social Proof, (iv) Authority, (v) Liking, (vi)
Scarcity, and (vii) No principle.</p>
        <p>The final dataset instances are labeled as either positive for LLM-generated content or negative
for human-generated content. The dataset of raw emails is publicly available in a Kaggle dataset<sup>1</sup>.
Since we focus on machine learning models, as described in Section 3.2, we
further process the dataset to extract features for the training phase. Referring to the literature,
we extracted a total of 30 features [29]. Details on the features are available in the appendix
(Table 2).</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Methods</title>
        <p>
          The overarching goal of our efforts is to provide an explainable green model for the
discrimination of human-generated and LLM-generated phishing emails. For this reason, we used smaller
classical machine learning models based on features rather than LLMs. To choose the best
models for this task, we first analyzed the available literature. Most of the similar works focus
on the following models: random forests [17], Support Vector Machines (SVM) [19], XGBoost
[18], Logistic Regression [66], K-Nearest Neighbors (KNN) [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ], Naïve Bayes [65], and Neural
Network [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ]. To further expand the models’ list, we also pre-trained the Neural Network on a
dataset of various (not necessarily in the phishing context) emails written by either humans or
LLMs, and then fine-tuned it using our dataset.
        </p>
        <p>
          Although these models are not always fully explainable by default, they are smaller and
require fewer resources than bigger neural networks or transformer-based models [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ]. To
ensure consistent results, all models were implemented using Python, and the training phase
was executed on a single machine. Furthermore, all models underwent a hyper-parameter
selection phase to maximize the performances of each model for their comparison. The final
parameters are available in the appendix (Table 3).
        </p>
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Experimental Results</title>
        <p>The training phase for each model was executed on a single machine powered by a
13th-generation Intel i7 processor and equipped with 16 GB of RAM. For these experiments, the
use of a GPU was not required. To evaluate the proposed methods, we employed a stratified
repeated 10-fold cross-validation. In other words, each fold contained roughly the same amount
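of positive and negative instances. For the neural network, we used the binary cross-entropy
loss function, defined as:
<disp-formula><tex-math>\mathcal{L}(y, \hat{y}) = -\left( y \log \hat{y} + (1 - y) \log(1 - \hat{y}) \right)</tex-math><label>(1)</label></disp-formula>
where <inline-formula><tex-math>y</tex-math></inline-formula> is the ground truth label, while <inline-formula><tex-math>\hat{y}</tex-math></inline-formula> is the model output for an individual observation.
Cross-entropy was minimized using the Adam optimizer and a fixed learning rate (whose value
was optimized in the hyper-parameter selection phase).</p>
        <p><sup>1</sup> https://www.kaggle.com/datasets/francescogreco97/human-llm-generated-phishing-legitimate-emails</p>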
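        <p>As an illustration of the binary cross-entropy loss for a single observation, the following sketch may help. The function name, the argument names (y for the ground truth label, y_hat for the model output), and the clipping constant eps are assumptions made here for readability; they are not taken from the training code used in this study.</p>
<preformat>
```python
import math


def binary_cross_entropy(y: int, y_hat: float, eps: float = 1e-12) -> float:
    """Loss for one observation: y is the 0/1 label, y_hat the predicted probability."""
    # Clip the prediction away from 0 and 1 so that log() stays finite.
    y_hat = min(max(y_hat, eps), 1.0 - eps)
    return -(y * math.log(y_hat) + (1 - y) * math.log(1.0 - y_hat))


print(binary_cross_entropy(1, 0.9))  # confident and correct: small loss
print(binary_cross_entropy(1, 0.1))  # confident and wrong: large loss
```
</preformat>
        <p>During training, this quantity would typically be averaged over a batch of observations before the optimizer step.</p>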
        <p>We computed the accuracy as the performance metric, defined as the proportion of correctly
classified instances (both true positives and true negatives) in the selected sample. Table 1
shows the average results of the repeated stratified 10-fold cross-validation for each model. The
distribution of the accuracy is better represented in Figure 1.</p>
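        <p>The evaluation protocol can be sketched in plain Python as follows. This is a simplified illustration rather than the code used for the experiments: the number of repetitions (3), the seed, and the function names are assumptions, and libraries such as scikit-learn offer an equivalent splitter (RepeatedStratifiedKFold).</p>
<preformat>
```python
import random
from collections import defaultdict


def repeated_stratified_folds(labels, n_splits=10, n_repeats=3, seed=0):
    """Yield (repeat, fold, test_indices) with class proportions kept per fold."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    for repeat in range(n_repeats):
        folds = [[] for _ in range(n_splits)]
        for indices in by_class.values():
            shuffled = indices[:]
            rng.shuffle(shuffled)
            # Deal the shuffled indices of each class round-robin over the folds.
            for position, index in enumerate(shuffled):
                folds[position % n_splits].append(index)
        for fold, test_indices in enumerate(folds):
            yield repeat, fold, sorted(test_indices)


def accuracy(y_true, y_pred):
    """Proportion of correctly classified instances."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Toy labels mirroring the balanced dataset: 1000 LLM-generated (1), 1000 human (0).
labels = [1] * 1000 + [0] * 1000
splits = list(repeated_stratified_folds(labels))
first_test = splits[0][2]
print(len(splits), len(first_test), sum(labels[i] for i in first_test))  # 30 200 100
```
</preformat>
        <p>In each split, the listed indices form the test fold and the remaining instances form the training set; the reported accuracy is then the mean over all splits.</p>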
        <p>By analyzing the results reported in Table 1, we can see that the Neural Network model
seems to be the best-performing model, although the gain in accuracy is only 0.58% over the
second-best model, SVMs. Among the better-performing models is logistic regression, which
has an accuracy of 99.03%.</p>
        <p>To better analyze the differences in performance among the various models, we performed a
paired t-test for each model pair. The statistical test aims to determine whether one can reject
the null hypothesis that the differences in the means are due only to chance. If the p-value
was found to be less than 0.05, we calculated the effect size using Cohen’s d score [69]. Since
this score ranges from 0 to 1, to facilitate its interpretation, we categorized the effect sizes into
four distinct levels: insignificant for values below 0.2, low for values ranging from 0.2 to 0.5,
medium for values between 0.51 and 0.8, and high for values between 0.81 and 1. To facilitate
the analysis of all these comparisons, in Figure 2 we depicted a matrix where each cell reports
the p-value resulting from the comparison between the model specified in the related column
and the model specified in the related row. Furthermore, each cell is color-coded to represent
the Cohen’s d level: orange for high, yellow for medium, and green for low levels of effect size,
while the cell has no color in case of insignificant values. In Figure 2 we can see that almost all
differences in model accuracies are statistically significant, except for the difference between KNN
and XGBoost and the one between Logistic Regression and Neural Network (with Transfer
Learning).</p>
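        <p>A minimal sketch of this comparison for one pair of models is given below. The per-fold accuracies are made up, the function names are hypothetical, and Cohen's d is computed on the paired differences, which is only one of several variants and is an assumption here. The sketch computes the t statistic only; the corresponding p-value would be obtained from a t distribution with n - 1 degrees of freedom (for instance through scipy.stats.ttest_rel).</p>
<preformat>
```python
import math
import statistics


def paired_t_statistic(a, b):
    """t statistic of a paired t-test on two equally long lists of scores."""
    diffs = [x - y for x, y in zip(a, b)]
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(len(diffs)))


def cohens_d_paired(a, b):
    """Cohen's d on the paired differences (one of several variants)."""
    diffs = [x - y for x, y in zip(a, b)]
    return abs(statistics.mean(diffs)) / statistics.stdev(diffs)


def effect_level(d):
    """Map an effect size to the levels used in the text."""
    if d >= 0.81:
        return "high"
    if d > 0.5:
        return "medium"
    if d >= 0.2:
        return "low"
    return "insignificant"


# Made-up per-fold accuracies for two hypothetical models.
model_a = [0.99, 0.98, 0.99, 0.97, 0.99]
model_b = [0.97, 0.97, 0.98, 0.97, 0.96]
print(paired_t_statistic(model_a, model_b), effect_level(cohens_d_paired(model_a, model_b)))
```
</preformat>
        <p>Repeating this computation for every pair of models fills a matrix of the kind shown in Figure 2, with the effect-size level determining the color of each cell.</p>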
        <p>
          While the models investigated in our study demonstrate high performance, comparable even
to the less interpretable and less green LLMs [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ], it remains paramount to provide users with
explanations regarding the malicious nature of content to defend against phishing attacks. For
these reasons, our study also focuses on informing users whether an email originates from an
AI source or not, detailing which aspect or feature of the text triggered suspicion, leading the
defense system to classify it as human-written or AI-written. In line with other studies [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ], the
final goal is to warn users about phishing attacks with a warning dialog, such as the one reported in
Figure 3, which includes a message explaining that the email opened may have been generated
by an AI.
        </p>
        <p>Technically, this entails the ML model providing a local explanation, pinpointing the most
influential feature among the 30 considered in the classification process for the classified email.
Therefore, determining the best model for this task necessitates an analysis of each model’s
explanation capabilities. While models like logistic regression and K-nearest neighbors (KNN)
are inherently explainable, the other models considered in this study require post-hoc models
such as LIME or SHAP to provide explanations; however, in the case of the black-box models,
the selected feature is an approximation of the one selected by the model, and can therefore be wrong and
less effective in the explanation phase. Given logistic regression’s innate ability to provide
transparent explanations, together with its exceptional classification performance demonstrated
in this study (virtually on a par with the best-performing neural networks), we argue that
logistic regression is the most appropriate choice for detecting emails generated by LLMs, while
providing essential explanations to users.</p>
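        <p>To make the notion of a local explanation concrete, the sketch below shows how a logistic regression model could single out the most influential feature for one email, by ranking the per-feature contributions (weight times feature value). The feature names, weights, bias, and feature values are hypothetical and do not correspond to the fitted model or to the 30 features used in this study.</p>
<preformat>
```python
import math

# Hypothetical feature names, weights, and bias: not the paper's fitted model.
WEIGHTS = {"avg_sentence_length": 1.2, "typo_rate": -2.0, "readability_index": 0.4}
BIAS = -0.3


def predict_proba(features):
    """Probability that the email is LLM-generated under the toy model."""
    z = BIAS + sum(WEIGHTS[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


def local_explanation(features):
    """Return the feature with the largest absolute contribution w_i * x_i."""
    contributions = {name: WEIGHTS[name] * value for name, value in features.items()}
    top = max(contributions, key=lambda name: abs(contributions[name]))
    return top, contributions[top]


# One email described by standardized (z-scored) feature values, also made up.
email = {"avg_sentence_length": 0.5, "typo_rate": -1.5, "readability_index": 1.0}
print(predict_proba(email), local_explanation(email))
```
</preformat>
        <p>Because the contributions are read directly from the model's own weights, the selected feature is exact rather than approximated, which is the property that post-hoc techniques applied to black-box models cannot guarantee.</p>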
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Conclusion and Future Work</title>
      <p>In this study, we analyzed different ML models for classifying emails as written by a human or
using an LLM in the context of phishing. Detecting AI-generated emails can help mitigate the
threat of phishing campaigns powered by LLMs, as these tools can produce convincing phishing
emails in a fraction of the time it would otherwise take cybercriminals to create them manually.
Therefore, we analyzed diferent ML models, which can be trained and used with less impact
on the environment compared to LLMs.</p>
      <p>
        Our experiments yielded interesting results: ML models achieved accuracies
above 90% in classifying the author of the emails, in line with other works in the
literature in other application domains (e.g., [54, 59, 49]). Considering the statistical significance
of the differences, the three best models (with an accuracy of over 99%) were Neural
Networks, SVMs, and Logistic Regression (or Neural Networks with transfer learning). Although
statistically significant, the differences between the performances of the three models were only
0.58% and 0.17%, respectively. However, neural networks are more computationally expensive [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] and are
difficult to explain [70]. Similarly, SVMs may also be difficult to interpret [71]. On the other
hand, Logistic Regression is a simple white-box model whose results are easy to
interpret [72]. Having a transparent model allows us to interpret a prediction in terms of
which features were more or less important in classifying a particular email as LLM-generated
or not. This makes it possible not only to use warning dialogs to warn the user when an email is classified
as generated by an LLM, but also to provide an explanation. Explanations have the advantage of
increasing the user’s motivation to heed the warning dialog and their trust in the system [36].
Furthermore, using warning dialogs with explanations that change depending on the specific
context enhances the effectiveness of the warnings, as it reduces the user’s habituation to
seeing the same warning under different circumstances [
        <xref ref-type="bibr" rid="ref9">73, 9</xref>
        ]. However, to obtain warnings
with these benefits, users must first understand the reported explanations [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]; therefore, if
the explanations are based on reporting which features were most relevant in the ML model’s
decision, we must be able to effectively describe to the user what those features are. This means
that not every feature in our feature set is suitable for building a good explanation for a
naive user, as some may be overly technical.
      </p>
      <p>Several future works are planned to extend and improve this research. First, we want to
explore multi-class models that can detect phishing emails in general and determine whether
the text is human-generated or not; unlike the post-hoc approaches proposed in this paper,
multi-class models can be used as a stand-alone solution, useful in a scenario where post-hoc is
not sufficient. Second, a user study is needed to determine which of the 30 features identified
in our research can be explained to users, even without technical knowledge. Third, we aim
to benchmark our ML models in an adversarial setting, i.e., using paraphrasing attacks that
introduce slight modifications in the LLM-generated emails, as even a light paraphraser can
drastically decrease the effectiveness of detector tools [62]. Furthermore, the dataset could be
extended to understand whether the current features, alongside additional ones, can be
used to detect phishing emails and their author using green and explainable machine learning
models. Future studies may investigate whether the slight loss of accuracy of simpler white-box models
impacts the usefulness of the classifier, through user studies and the inclusion of additional
metrics (e.g., F1-score, precision, and recall). Finally, end-user development techniques will be
explored to support the adaptation of the AI model and user interface to different contexts, with
the aim of making the overall solution more tailored to specific needs [74, 75, 76].</p>
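      <p>The additional metrics mentioned above can be computed directly from predicted and true labels. The sketch below is illustrative only: the label names, the helper binary_metrics, and the toy data are assumptions, and no result from the study is reproduced. It treats "llm" as the positive class.</p>

```python
def binary_metrics(y_true, y_pred, positive="llm"):
    """Return accuracy, precision, recall and F1 for a binary classifier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    # Guard against empty denominators (e.g. no positive predictions).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = correct / len(pairs) if pairs else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels, invented for illustration.
y_true = ["llm", "llm", "llm", "human", "human", "human", "human", "llm"]
y_pred = ["llm", "llm", "human", "human", "llm", "human", "human", "llm"]

metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

      <p>Precision and recall separate the two kinds of error that accuracy merges: a human-written email flagged as AI-generated lowers precision, while an LLM-generated email that goes unflagged lowers recall.</p>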
    </sec>
    <sec id="sec-5">
      <title>Acknowledgments</title>
      <p>This work is partially supported by the Italian Ministry of University and Research (MUR) under
grant PRIN 2022 PNRR “DAMOCLES: Detection And Mitigation Of Cyber attacks that exploit
human vuLnerabilitiES” – CUP: H53D23008140001.</p>
      <p>This work is partially supported by the co-funding of the European Union - Next
Generation EU: NRRP Initiative, Mission 4, Component 2, Investment 1.3 – Partnerships extended
to universities, research centres, companies and research D.D. MUR n. 341 del 5.03.2022 –
Next Generation EU (PE0000014 – “Security and Rights In the CyberSpace – SERICS” - CUP:
H93C22000620001).</p>
      <p>The research of Francesco Greco is funded by a PhD fellowship within the framework of the
Italian “D.M. n. 352, April 9, 2022” - under the National Recovery and Resilience Plan, Mission 4,
Component 2, Investment 3.3 - PhD Project “Investigating XAI techniques to help user defend
from phishing attacks”, co-supported by “Auriga S.p.A.” (CUP H91I22000410007).</p>
      <p>The research of Andrea Esposito is funded by a Ph.D. fellowship within the framework of the
Italian “D.M. n. 352, April 9, 2022” - under the National Recovery and Resilience Plan, Mission
4, Component 2, Investment 3.3 - Ph.D. Project “Human-Centered Artificial Intelligence (HCAI)
techniques for supporting end users interacting with AI systems”, co-supported by “Eusoft S.r.l.”
(CUP H91I22000410007).</p>
      <p>[17] A. Sharma, A. Nandan, R. Ralhan, An investigation of supervised learning methods for
authorship attribution in short hinglish texts using char &amp; word n-grams, 2018. URL:
http://arxiv.org/abs/1812.10281. doi:arXiv:1812.10281.
[18] R. Shijaku, E. Canhasi, Chatgpt generated text detection, 2023. URL: http://dx.doi.org/10.13140/RG.2.2.21317.52960. doi:10.13140/RG.2.2.21317.52960.
[19] T. Solorio, S. Pillay, S. Raghavan, M. Montes y Gómez, Modality specific meta features
for authorship attribution in web forum posts, in: International Joint Conference on
Natural Language Processing, Asian Federation of Natural Language Processing, Chiang
Mai, Thailand, 2011, pp. 156–164. URL: https://aclanthology.org/I11-1018.
[20] Anonymous, Phishing email curated dataset, 2023. URL: https://zenodo.org/records/
8339691. doi:10.5281/zenodo.8339691.
[21] Forsasuke, Wormgpt, 2023. URL: https://flowgpt.com/p/wormgpt-6.
[22] L. Fröhling, A. Zubiaga, Feature-based detection of automated language models: tackling
gpt-2, gpt-3 and grover, PeerJ Computer Science 7 (2021) 23. URL: https://doi.org/10.7717/
peerj-cs.443. doi:10.7717/peerj-cs.443.
[23] P. Jwalapuram, S. Joty, X. Lin, Rethinking self-supervision objectives for generalizable
coherence modeling, in: Annual Meeting of the Association for Computational Linguistics,
volume 1, Association for Computational Linguistics, Dublin, Ireland, 2022, pp. 6044–6059.
URL: https://doi.org/10.18653/v1/2022.acl-long.418. doi:10.18653/v1/2022.acl-long.
418.
[24] Y. Ma, J. Liu, F. Yi, Q. Cheng, Y. Huang, W. Lu, X. Liu, Ai vs. human – differentiation analysis
of scientific content generation, 2023. URL: http://arxiv.org/abs/2301.10416. doi:arXiv:2301.10416.
[25] A. Muñoz-Ortiz, C. Gómez-Rodríguez, D. Vilares, Contrasting linguistic patterns in
human and llm-generated text, 2023. URL: http://arxiv.org/abs/2308.09067. doi:arXiv:
2308.09067.
[26] T. T. Nguyen, A. Hatua, A. H. Sung, How to detect ai-generated texts?, in: Annual
Ubiquitous Computing, Electronics &amp; Mobile Communication Conference, IEEE, New
York, USA, 2023, pp. 464–471. URL: https://ieeexplore.ieee.org/document/10316132. doi:10.
1109/UEMCON59035.2023.10316132.
[27] R. Barzilay, M. Lapata, Modeling local coherence: An entity-based approach,
Computational Linguistics 34 (2008) 1–34. URL: https://doi.org/10.1162/coli.2008.34.1.1. doi:10.
1162/coli.2008.34.1.1.
[28] D. Kosmajac, V. Keselj, Twitter bot detection using diversity measures, in: International
Conference on Natural Language and Speech Processing, Association for Computational
Linguistics, Trento, Italy, 2019, pp. 1–8. URL: https://aclanthology.org/W19-7401.
[29] S. T. Piantadosi, Zipf’s word frequency law in natural language: A critical review and
future directions, Psychonomic Bulletin &amp; Review 21 (2014) 1112–1130. URL: https://doi.
org/10.3758/s13423-014-0585-6. doi:10.3758/s13423-014-0585-6.
[30] M. T. Ribeiro, S. Singh, C. Guestrin, "why should i trust you?": Explaining the predictions
of any classifier, 2016. URL: http://arxiv.org/abs/1602.04938. doi: arXiv:1602.04938.
[31] S. Lundberg, S.-I. Lee, A unified approach to interpreting model predictions, 2017. URL:
http://arxiv.org/abs/1705.07874. doi:arXiv:1705.07874v2.
[32] P. Kumaraguru, S. Sheng, A. Acquisti, L. F. Cranor, J. Hong, Teaching johnny not to
fall for phish, ACM Transactions on Internet Technology 10 (2010) 1–31. URL: https:
//doi.org/10.1145/1754393.1754396. doi:10.1145/1754393.1754396.
[33] G. Varshney, M. Misra, P. K. Atrey, A survey and classification of web phishing
detection schemes, Security and Communication Networks 9 (2016) 6266–6284. URL:
https://onlinelibrary.wiley.com/doi/abs/10.1002/sec.1674. doi:10.1002/sec.1674.
[34] S. Sheng, B. Wardman, G. Warner, L. Cranor, J. Hong, C. Zhang, An empirical analysis of
phishing blacklists, in: International Conference on Email and Anti-Spam, Mountain View,
California, USA, 2009. URL: https://kilthub.cmu.edu/articles/journal_contribution/An_
Empirical_Analysis_of_Phishing_Blacklists/6469805/1. doi:10.1184/R1/6469805.V1.
[35] J. Petelka, Y. Zou, F. Schaub, Put your warning where your link is: Improving and evaluating
email phishing warnings, in: CHI Conference on Human Factors in Computing Systems,
ACM, Glasgow Scotland UK, 2019, pp. 1–15. URL: https://doi.org/10.1145/3290605.3300748.
doi:10.1145/3290605.3300748.
[36] G. Vilone, L. Longo, Notions of explainability and evaluation approaches for explainable
artificial intelligence, Information Fusion 76 (2021) 89–106. URL: https://doi.org/10.1016/j.inffus.2021.05.009. doi:10.1016/j.inffus.2021.05.009.
[37] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, I.
Polosukhin, Attention is all you need, 2017. URL: https://doi.org/10.48550/arXiv.1706.03762.
doi:10.48550/arXiv.1706.03762.
[38] HuggingFace, Open llm leaderboard, 2024. URL: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard.
[39] OpenAI, Introducing chatgpt, 2022. URL: https://openai.com/blog/chatgpt.
[40] OpenAI, Gpt-4 and gpt-4 turbo, 2023. URL: https://platform.openai.com/docs/models/
gpt-4-and-gpt-4-turbo.
[41] Z. Ghahramani, Introducing palm 2 (2023). URL: https://blog.google/technology/ai/
google-palm-2-ai-large-language-model/.
[42] G. DeepMind, Gemini, 2024. URL: https://deepmind.google/technologies/gemini.
[43] Anthropic, Claude 2 (2023). URL: https://www.anthropic.com/news/claude-2.
[44] Meta, Llama 2, 2023. URL: https://llama.meta.com/.
[45] D. Kang, X. Li, I. Stoica, C. Guestrin, M. Zaharia, T. Hashimoto, Exploiting programmatic
behavior of llms: Dual-use through standard security attacks, 2023. URL: https://arxiv.org/
abs/2302.05733. doi:arXiv:2302.05733.
[46] J. Hazell, Spear phishing with large language models, 2023. URL: https://arxiv.org/abs/
2305.06972. doi:arXiv:2305.06972.
[47] How well does GPT phish people? An investigation involving cognitive biases and
feedback, IEEE, 2023. URL: https://ieeexplore.ieee.org/document/10190709. doi:10.1109/
EuroSPW59978.2023.00055.
[48] C. Barrett, B. Boyd, E. Bursztein, N. Carlini, B. Chen, J. Choi, A. R. Chowdhury,
M. Christodorescu, A. Datta, S. Feizi, K. Fisher, T. Hashimoto, D. Hendrycks, S. Jha,
D. Kang, F. Kerschbaum, E. Mitchell, J. Mitchell, Z. Ramzan, K. Shams, D. Song, A. Taly,
D. Yang, Identifying and mitigating the security risks of generative ai, Foundations and
Trends®in Privacy and Security 6 (2023) 1–52. URL: http://dx.doi.org/10.1561/3300000041.
doi:10.1561/3300000041.
[49] E. Mitchell, Y. Lee, A. Khazatsky, C. D. Manning, C. Finn, Detectgpt: Zero-shot
machine-generated text detection using probability curvature, 2023. URL: http://arxiv.org/abs/2301.11305. doi:arXiv:2301.11305.
[50] I. Solaiman, M. Brundage, J. Clark, A. Askell, A. Herbert-Voss, J. Wu, A. Radford, G. Krueger,
J. W. Kim, S. Kreps, M. McCain, A. Newhouse, J. Blazakis, K. McGuffie, J. Wang, Release
strategies and the social impacts of language models, 2019. URL: http://arxiv.org/abs/1908.09203. doi:arXiv:1908.09203.
[51] Y. Liu, M. Ott, N. Goyal, J. Du, M. Joshi, D. Chen, O. Levy, M. Lewis, L. Zettlemoyer,
V. Stoyanov, Roberta: A robustly optimized bert pretraining approach, 2019. URL: http:
//arxiv.org/abs/1907.11692. doi:arXiv:1907.11692.
[52] OpenAI, New ai classifier for indicating ai-written text, 2023. URL: https://openai.com/
blog/new-ai-classifier-for-indicating-ai-written-text.
[53] OpenAI, Gpt-2 output detector, 2019. URL: https://github.com/openai/
gpt-2-output-dataset/tree/master/detector.
[54] R. Zellers, A. Holtzman, H. Rashkin, Y. Bisk, A. Farhadi, F. Roesner, Y. Choi, Defending
against neural fake news, 2020. URL: http://arxiv.org/abs/1905.12616. doi:arXiv:1905.
12616v3.
[55] S. Gehrmann, H. Strobelt, A. M. Rush, Gltr: Statistical detection and visualization of
generated text, 2019. URL: http://arxiv.org/abs/1906.04043. doi:arXiv:1906.04043.
[56] D. I. Adelani, H. Mai, F. Fang, H. H. Nguyen, J. Yamagishi, I. Echizen, Generating
sentiment-preserving fake online reviews using neural language models and their human- and
machine-based detection, 2019. URL: http://arxiv.org/abs/1907.09177. doi:arXiv:1907.09177.
[57] T. Fagni, F. Falchi, M. Gambini, A. Martella, M. Tesconi, Tweepfake: About detecting
deepfake tweets, PLOS ONE 16 (2021) e0251415. URL: https://doi.org/10.1371/journal.pone.
0251415. doi:10.1371/journal.pone.0251415.
[58] A. Uchendu, T. Le, K. Shu, D. Lee, Authorship attribution for neural text generation,
in: Conference on Empirical Methods in Natural Language Processing, Association for
Computational Linguistics, Online, 2020, pp. 8384–8395. URL: https://aclanthology.org/
2020.emnlp-main.673. doi:10.18653/v1/2020.emnlp-main.673.
[59] S. Mitrović, D. Andreoletti, O. Ayoub, Chatgpt or human? detect and explain. explaining
decisions of machine learning model for detecting short chatgpt-generated text, 2023. URL:
http://arxiv.org/abs/2301.13852. doi:arXiv:2301.13852.
[60] I. S. Moskowitz (Ed.), Natural Language Watermarking: Design, Analysis, and a
Proof-of-Concept Implementation, volume 2137 of LNCS, Information Hiding, Springer
Berlin Heidelberg, Berlin, Heidelberg, 2001. URL: https://link.springer.com/chapter/10.1007/3-540-45496-9_14.
[61] J. Kirchenbauer, J. Geiping, Y. Wen, J. Katz, I. Miers, T. Goldstein, A watermark for large
language models, 2023. URL: http://arxiv.org/abs/2301.10226. doi:arXiv:2301.10226v3.
[62] V. S. Sadasivan, A. Kumar, S. Balasubramanian, W. Wang, S. Feizi, Can ai-generated text be
reliably detected?, 2023. URL: http://arxiv.org/abs/2303.11156. doi:arXiv:2303.11156v2.
[63] K. Krishna, Y. Song, M. Karpinska, J. Wieting, M. Iyyer, Paraphrasing evades detectors of
ai-generated text, but retrieval is an effective defense, 2023. URL: http://arxiv.org/abs/2303.13408. doi:arXiv:2303.13408.
[64] F. Sebastiani, Machine learning in automated text categorization, ACM Computing
Surveys 34 (2002) 1–47. URL: https://doi.org/10.1145/505282.505283. doi:10.1145/505282.
505283.
[65] F. Howedi, M. Masnizah, Text classification for authorship attribution using naive
bayes classifier with limited training data, Computer Engineering and Intelligent
Systems 5 (2014). URL: https://iiste.org/Journals/index.php/CEIS/article/view/12132/12484.
[66] OpenAI, Logistic regression gpt-2 detector, 2019. URL: https://github.com/openai/
gpt-2-output-dataset/blob/master/baseline.py.
[67] R. Verdecchia, J. Sallou, L. Cruz, A systematic review of green ai, WIREs Data Mining and
Knowledge Discovery 13 (2023) 26. URL: https://wires.onlinelibrary.wiley.com/doi/abs/10.
1002/widm.1507. doi:10.1002/widm.1507.
[68] R. B. Cialdini, Influence: The Psychology of Persuasion, Collins Business Essentials, revised
ed., Harper Collins, 2009.
[69] J. Cohen, Statistical Power Analysis for the Behavioral Sciences, 2nd edition ed.,
Routledge, New York, USA, 1988. URL: https://doi.org/10.4324/9780203771587. doi:10.4324/
9780203771587.
[70] O. Loyola-González, Black-box vs. white-box: Understanding their advantages and
weaknesses from a practical point of view, IEEE Access 7 (2019) 154096–154113. URL:
https://ieeexplore.ieee.org/document/8882211. doi:10.1109/ACCESS.2019.2949286.
[71] A. Navia-Vázquez, E. Parrado-Hernández, Support vector machine interpretation,
Neurocomputing 69 (2006) 1754–1759. URL: https://www.sciencedirect.com/science/article/pii/
S0925231205004480. doi:10.1016/j.neucom.2005.12.118.
[72] S. Meacham, G. Isaac, D. Nauck, B. Virginas, Towards explainable ai: Design and
development for explanation of machine learning predictions for a patient readmittance
medical application, in: K. Arai, R. Bhatia, S. Kapoor (Eds.), Intelligent Computing,
volume 997, Springer, Cham, London, UK, 2019, pp. 939–955. URL: https://doi.org/10.1007/978-3-030-22871-2_67. doi:10.1007/978-3-030-22871-2_67.
[73] F. Greco, G. Desolda, A. Esposito, Explaining phishing attacks: An xai approach to enhance
user awareness and trust, in: F. Buccafurri, E. Ferrari, G. Lax (Eds.), The Italian Conference
on CyberSecurity, volume 3488, CEUR-WS, Bari, Italy, 2023. URL: https://ceur-ws.org/
Vol-3488/paper22.pdf.
[74] C. Ardito, P. Bottoni, M. F. Costabile, G. Desolda, M. Matera, A. Piccinno, M.
Picozzi, Enabling end users to create, annotate and share personal information
spaces, Lecture Notes in Computer Science (including subseries Lecture Notes in
Artificial Intelligence and Lecture Notes in Bioinformatics) 7897 LNCS (2013) 40 – 55.
URL: https://www.scopus.com/inward/record.uri?eid=2-s2.0-84884378360&amp;doi=10.1007%
2f978-3-642-38706-7_5&amp;partnerID=40&amp;md5=ac9ba219ee101062200d61f268479daa. doi:10.
1007/978-3-642-38706-7_5.
[75] G. Desolda, Enhancing workspace composition by exploiting linked open data as a
polymorphic data source, Smart Innovation, Systems and Technologies 40 (2015) 97 – 108.
URL: https://www.scopus.com/inward/record.uri?eid=2-s2.0-84947913933&amp;doi=10.1007%
2f978-3-319-19830-9_9&amp;partnerID=40&amp;md5=2e4d49da34406b062da3f5f310e3b922. doi:10.
1007/978-3-319-19830-9_9.
[76] C. Ardito, M. F. Costabile, G. Desolda, M. Latzina, M. Matera, Making mashups actionable
through elastic design principles, Lecture Notes in Computer Science (including subseries
Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 9083 (2015)
236 – 241. doi:10.1007/978-3-319-18425-8_22.
[77] G. Jawahar, M. Abdul-Mageed, L. V. S. Lakshmanan, Automatic detection of machine
generated text: A critical survey, in: International Conference on Computational
Linguistics, International Committee on Computational Linguistics, Barcelona, Spain (Online),
2020, pp. 2296–2309. URL: https://aclanthology.org/2020.coling-main.208. doi:10.18653/v1/2020.coling-main.208.</p>
    </sec>
    <sec id="sec-6">
      <title>A. Appendix</title>
      <p>average_word_length, pos_tag_frequency, uppercase_frequency, average_sentence_length,
function_words_frequency, flesch_reading_ease, type_token_ratio, dependency_types, emotions,
named_entity_count, common_words, stop_words, bigram, trigram, lack_of_purpose,
word_distribution_zipf_law_slope, word_distribution_zipf_law_r_squared,
word_distribution_zipf_law_cost, consistency_phrasal_verbs, text_diversity_yulek,
text_diversity_simpsond, text_diversity_honorer, text_diversity_sichels, coherence_1,
coherence_2, constituent_lengths, constituent_types, coreference_resolution,
lexical_diversity.
[Appendix tables and figure not recovered. Surviving labels: column "Reference paper(s)"; columns "Model" and "Parameters" with rows "SVM" and "XGBoosting"; neural network diagram with input layer "dense_3_input".]</p>
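      <p>To give a concrete sense of how some of the lexical features named above could be computed, the sketch below derives three of them from raw text. It is an assumption-laden illustration: the tokenizer, the function name lexical_features, and the exact formulas (including the common definition of Yule's K used here for text_diversity_yulek) are not taken from the study and may differ from the definitions the authors used.</p>

```python
import re
from collections import Counter


def lexical_features(text):
    """Compute a few simple stylometric features for one email body."""
    # Naive tokenizer (an assumption): lowercase alphabetic words only.
    tokens = re.findall(r"[a-z']+", text.lower())
    n = len(tokens)
    if n == 0:
        return {"average_word_length": 0.0,
                "type_token_ratio": 0.0,
                "text_diversity_yulek": 0.0}

    counts = Counter(tokens)

    # Mean number of characters per token.
    average_word_length = sum(len(tok) for tok in tokens) / n

    # Distinct words divided by total words.
    type_token_ratio = len(counts) / n

    # Yule's K = 10^4 * (sum over types of freq^2 - N) / N^2.
    sum_sq = sum(c * c for c in counts.values())
    yule_k = 1e4 * (sum_sq - n) / (n * n)

    return {"average_word_length": average_word_length,
            "type_token_ratio": type_token_ratio,
            "text_diversity_yulek": yule_k}


features = lexical_features("The cat sat on the mat the end")
print(features)
```

      <p>Features of this kind are cheap to compute and have a direct verbal description, which makes them plausible candidates for user-facing explanations.</p>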
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <collab>IBM Security</collab>
          ,
          <source>X-Force threat intelligence index</source>
          ,
          <year>2023</year>
          . URL: https://www.ibm.com/reports/threat-intelligence.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A.</given-names>
            <surname>Almomani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. B.</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Atawneh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Meulenberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Almomani</surname>
          </string-name>
          ,
          <article-title>A survey of phishing email filtering techniques</article-title>
          ,
          <source>IEEE Communications Surveys &amp; Tutorials</source>
          <volume>15</volume>
          (
          <year>2013</year>
          )
          <fpage>2070</fpage>
          -
          <lpage>2090</lpage>
          . URL: https://ieeexplore.ieee.org/document/6489877. doi:10.1109/SURV.2013.030713.00020.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>M.</given-names>
            <surname>Khonji</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Iraqi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Jones</surname>
          </string-name>
          ,
          <article-title>Phishing detection: A literature survey</article-title>
          ,
          <source>IEEE Communications Surveys &amp; Tutorials</source>
          <volume>15</volume>
          (
          <year>2013</year>
          )
          <fpage>2091</fpage>
          -
          <lpage>2121</lpage>
          . URL: https://ieeexplore.ieee.org/document/6497928. doi:10.1109/SURV.2013.032213.00009.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>F.</given-names>
            <surname>Heiding</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Schneier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Vishwanath</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Bernstein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. S.</given-names>
            <surname>Park</surname>
          </string-name>
          ,
          <article-title>Devising and detecting phishing: Large language models vs. smaller human models</article-title>
          ,
          <year>2023</year>
          . URL: https://doi.org/10.48550/arXiv.2308.12287. doi:arXiv:2308.12287.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>T.</given-names>
            <surname>Koide</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Fukushi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Nakano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Chiba</surname>
          </string-name>
          ,
          <article-title>Detecting phishing sites using chatgpt</article-title>
          ,
          <year>2023</year>
          . URL: https://arxiv.org/abs/2306.05816. doi:arXiv:2306.05816.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>M.</given-names>
            <surname>Labonne</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Moran</surname>
          </string-name>
          ,
          <article-title>Spam-t5: Benchmarking large language models for few-shot email spam detection</article-title>
          ,
          <year>2023</year>
          . URL: http://arxiv.org/abs/2304.01238. doi:arXiv:2304.01238.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>G.</given-names>
            <surname>Desolda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. S.</given-names>
            <surname>Ferro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Marrella</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Catarci</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. F.</given-names>
            <surname>Costabile</surname>
          </string-name>
          ,
          <article-title>Human factors in phishing attacks: A systematic literature review</article-title>
          ,
          <year>2021</year>
          . URL: https://doi.org/10.1145/3469886. doi:10.1145/3469886.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>S.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. S.</given-names>
            <surname>Wogalter</surname>
          </string-name>
          ,
          <article-title>Habituation, dishabituation, and recovery effects in visual warnings</article-title>
          ,
          <source>Human Factors and Ergonomics Society Annual Meeting</source>
          <volume>53</volume>
          (
          <year>2009</year>
          )
          <fpage>1612</fpage>
          -
          <lpage>1616</lpage>
          . URL: https://journals.sagepub.com/doi/abs/10.1177/154193120905302015. doi:10.1177/154193120905302015.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>B. B.</given-names>
            <surname>Anderson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. B.</given-names>
            <surname>Kirwan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. L.</given-names>
            <surname>Jenkins</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Eargle</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Howard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Vance</surname>
          </string-name>
          ,
          <article-title>How polymorphic warnings reduce habituation in the brain: Insights from an fmri study</article-title>
          ,
          <source>in: ACM Conference on Human Factors in Computing Systems</source>
          , ACM, Seoul, Republic of Korea,
          <year>2015</year>
          , pp.
          <fpage>2883</fpage>
          -
          <lpage>2892</lpage>
          . URL: https://doi.org/10.1145/2702123.2702322. doi:10.1145/2702123.2702322.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>C.</given-names>
            <surname>Bravo-Lillo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. F.</given-names>
            <surname>Cranor</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Downs</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Komanduri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Sleeper</surname>
          </string-name>
          ,
          <article-title>Improving computer security dialogs</article-title>
          , in: International Conference on Human-Computer Interaction, volume LNCS, Springer Berlin Heidelberg, Berlin, Heidelberg,
          <year>2011</year>
          , pp.
          <fpage>18</fpage>
          -
          <lpage>35</lpage>
          . URL: https://dl.acm.org/doi/10.5555/2042283.2042286.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>P.</given-names>
            <surname>Buono</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Desolda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Greco</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Piccinno</surname>
          </string-name>
          ,
          <article-title>Let warnings interrupt the interaction and explain: designing and evaluating phishing email warnings</article-title>
          ,
          <source>in: CHI Conference on Human Factors in Computing Systems</source>
          , volume EA, ACM, Hamburg Germany,
          <year>2023</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>6</lpage>
          . URL: https://dl.acm.org/doi/abs/10.1145/3544549.3585802. doi:10.1145/3544549.3585802.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>G.</given-names>
            <surname>Desolda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Aneke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Ardito</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Lanzilotti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. F.</given-names>
            <surname>Costabile</surname>
          </string-name>
          ,
          <article-title>Explanations in warning dialogs to help users defend against phishing attacks</article-title>
          ,
          <year>2023</year>
          . URL: https://www.sciencedirect.com/science/article/pii/S1071581923000654. doi:10.1016/j.ijhcs.2023.103056.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>R.</given-names>
            <surname>Schwartz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Dodge</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. A.</given-names>
            <surname>Smith</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Etzioni</surname>
          </string-name>
          ,
          <article-title>Green AI</article-title>
          ,
          <source>Communications of the ACM</source>
          <volume>63</volume>
          (
          <year>2020</year>
          )
          <fpage>54</fpage>
          -
          <lpage>63</lpage>
          . URL: https://doi.org/10.1145/3381831. doi:10.1145/3381831.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>H.</given-names>
            <surname>Alshaher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <article-title>A new term weight scheme and ensemble technique for authorship identification</article-title>
          ,
          <source>in: International Conference on Compute and Data Analysis</source>
          , ACM, Silicon Valley, CA, USA,
          <year>2020</year>
          , pp.
          <fpage>123</fpage>
          -
          <lpage>130</lpage>
          . URL: https://doi.org/10.1145/3388142.3388159. doi:10.1145/3388142.3388159.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>R. E.</given-names>
            <surname>Roxas</surname>
          </string-name>
          (Ed.),
          <article-title>Stylometric Studies based on Tone and Word Length Motifs</article-title>
          ,
          <source>in: Pacific Asia Conference on Language, Information and Computation</source>
          , The National University (Philippines),
          <year>2017</year>
          . URL: https://aclanthology.org/Y17-1011.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>P.</given-names>
            <surname>Sarzaeim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Doshi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Mahmoud</surname>
          </string-name>
          ,
          <article-title>A framework for detecting AI-generated text in research publications</article-title>
          , in: International Conference on Advanced Technologies, volume
          <volume>11</volume>
          , Istanbul, Turkey,
          <year>2023</year>
          , pp.
          <fpage>121</fpage>
          -
          <lpage>127</lpage>
          . URL: https://proceedings.icatsconf.org/conf/index.php/ICAT/article/view/36. doi:10.58190/icat.2023.28.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          Model hyperparameters: C=5, degree=1, gamma=0.01, kernel='poly', random_state=42; booster='gbtree', eta=0.01, gamma=0, max_depth=3, min_child_weight=1, random_state=42; C=100, penalty='l2', random_state=42; leaf_size=1, n_neighbors=1, p=1
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>