<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Assessment of LLM-Generated Smart Contracts in Ethereum</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Mirco Vella</string-name>
          <email>vellamirco@libero.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Antonio Emanuele Cinà</string-name>
          <email>antonio.cina@unige.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Marina Ribaudo</string-name>
          <email>marina.ribaudo@unige.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Fabio Roli</string-name>
          <email>fabio.roli@unige.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="editor">
          <string-name>Large Language Models, Smart Contract Generation, Auditing Tools, Program Verification, Security</string-name>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>DIBRIS, Università di Genova</institution>
          ,
          <addr-line>Via Dodecaneso, 3, Genova</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>DIEE, Università di Cagliari</institution>
          ,
          <addr-line>Via Marengo, 09123 Cagliari</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Blockchain technology and smart contracts are increasingly used in finance, government, and industry. At the same time, Large Language Models (LLMs) offer new possibilities, such as generating smart contracts from natural language, which can lower costs and speed up development. However, smart contracts are often vulnerable to security flaws, whether written by humans or AI, leading to significant financial losses. This study systematically evaluates the quality and security of smart contract code generated by LLMs in the Ethereum blockchain ecosystem. We generated 250 smart contracts using two state-of-the-art models, GPT-4 and DeepSeek-Coder, and assessed their security using automated vulnerability detection tools, Slither and Mythril. Our findings reveal that while LLM-generated smart contracts exhibit improvements in syntactic correctness and coherence, they still suffer from critical security vulnerabilities, making them unsuitable for fully autonomous development.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        The integration of Large Language Models (LLMs) into software development remains an active area of
research, driven by the rapid emergence of diverse models with distinct capabilities [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ]. These models
demonstrate significant potential in automating various programming tasks, streamlining software
development through the generation of boilerplate code [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], function completion from minimal input [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ],
and assistance in debugging [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. Reducing the time required for coding and troubleshooting allows
developers to focus on strategic and creative aspects of their work [
        <xref ref-type="bibr" rid="ref6 ref7">6, 7</xref>
        ].
      </p>
      <p>
        One particularly promising yet challenging application of LLMs is smart contract development for
blockchain systems. As a core component of programmable blockchain platforms like Ethereum [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ],
smart contracts enable the automated execution of agreements [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. However, a major concern when
using LLMs to generate code is their lack of awareness of security risks. Studies have shown that
AI-generated code can introduce vulnerabilities that compromise security standards [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. In the case
of smart contracts, such flaws can lead to severe financial losses and security breaches [
        <xref ref-type="bibr" rid="ref11 ref12">11, 12</xref>
        ]. As a
consequence, the use of LLMs for smart contract development requires careful evaluation to determine
whether AI-generated code meets necessary security and reliability standards. In this regard, auditing
tools offer a valuable means of assessing code security. This paper therefore addresses the
following research question:
      </p>
      <p>RQ. “Can LLMs accelerate smart contract development while ensuring high-quality, secure code?”</p>
      <p>To explore this question, we used a dataset of diverse prompts for smart contract generation which
were processed by two AI models, producing code that was subsequently analyzed using two auditing
tools. The findings provide insights into whether LLM-generated smart contracts are secure and reliable.</p>
      <p>CEUR</p>
      <p>ceur-ws.org</p>
      <p>The paper is structured as follows. Section 2 provides background information on blockchain
technology, with a focus on the Ethereum platform, and introduces the notion of LLMs. Furthermore, it
also reviews relevant literature on AI-assisted smart contract development. Section 3 details the data
collection process used to construct the dataset for evaluating the selected LLMs and the methodology
for generating smart contracts automatically. Section 4 presents the experimental results, assessing key
metrics such as contract compilability and vulnerability detection. Finally, Section 5 summarizes the
findings and discusses potential directions for future research.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Preliminaries and Related Work</title>
      <p>This section provides a high-level overview of the key concepts essential for understanding the content
and terminology used throughout this paper.</p>
      <sec id="sec-2-1">
        <title>2.1. Smart contract vulnerabilities</title>
        <p>
          Blockchain is a decentralized architecture designed to securely record transactions and data in an
immutable ledger structured as a chain of blocks, maintained through consensus among peers. Beyond
basic data storage, many blockchain platforms, such as Ethereum [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ], support the execution of smart
contracts, which are self-executing programs that enforce agreements between parties without the
need for intermediaries. First envisioned by Szabo [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ], smart contracts enable the development of
decentralized trusted applications known as dApps [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ].
        </p>
        <p>
          Despite their advantages, smart contracts are susceptible to critical vulnerabilities<sup>1</sup> that can
compromise their functionality and security [
          <xref ref-type="bibr" rid="ref16 ref17">16, 17</xref>
          ]. Examples include exceeding the gas limit, causing
denial-of-service, miner manipulation of block.timestamp, and reentrancy attacks, such as the DAO
hack [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ]. These vulnerabilities underscore the necessity for systematic development and rigorous
security auditing to ensure robust, reliable smart contracts and minimize financial risks [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ]. Aligned
with this need, Destefanis et al. [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ] advocate strong software engineering practices and
a dedicated discipline for smart contract development, citing major incidents like the Parity wallet
attack [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ] that resulted in significant losses. Kim and Ryu [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ] classify security methods into static
analysis for vulnerability detection and correctness, plus dynamic analysis, identifying key challenges and
future directions. Sendner et al. [
          <xref ref-type="bibr" rid="ref22">22</xref>
          ] evaluate automated tools like Slither [
          <xref ref-type="bibr" rid="ref23 ref24">23, 24</xref>
          ] and Mythril [
          <xref ref-type="bibr" rid="ref25 ref26">25, 26</xref>
          ],
concluding that combining multiple scanners is essential for effective vulnerability detection.
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Large Language Models</title>
        <p>
          Large Language Models (LLMs) leverage transformer architectures [
          <xref ref-type="bibr" rid="ref27">27</xref>
          ] for diverse language processing
tasks. Trained on extensive datasets of natural language and code [
          <xref ref-type="bibr" rid="ref28">28</xref>
          ], they develop strong language
understanding, generation, and in-context learning capabilities. LLMs have become essential tools across
multiple fields, enabling applications such as scam detection [
          <xref ref-type="bibr" rid="ref29 ref30 ref31">29, 30, 31</xref>
          ], automated code generation [
          <xref ref-type="bibr" rid="ref32">32</xref>
          ],
code repair [
          <xref ref-type="bibr" rid="ref33 ref34">33, 34</xref>
          ], and reverse deobfuscation [
          <xref ref-type="bibr" rid="ref35">35</xref>
          ]. Their integration into software engineering has
notably accelerated code synthesis and automation workflows [
          <xref ref-type="bibr" rid="ref6 ref7">6, 7</xref>
          ], including the generation of
smart contracts for blockchain platforms. Given their growing relevance and production use for code
synthesis, researchers have examined the security of LLM-generated code more broadly. He and
Vechev [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] evaluate security risks through adversarial testing, guiding LLMs to generate both secure
and intentionally unsafe code. Similarly, Pearce et al. [
          <xref ref-type="bibr" rid="ref36">36</xref>
          ] manually inspect GitHub Copilot-generated
code, finding that approximately 40% contains security vulnerabilities. Khoury et al. [37] extend this
analysis to ChatGPT-generated code, corroborating similar risk levels. More recently, Li et al. [38]
provide a comparative assessment across multiple LLMs, reinforcing these findings. In the context
of smart contract generation via LLMs, Karanjai et al. [39] compare Google PaLM2 and GPT-3.5
in generating Solidity smart contracts from natural language descriptions. While PaLM2 achieved
higher accuracy, both models frequently produced subtle bugs and exhibited poor coding practices,
highlighting concerns about their reliability for production deployment. Similarly, Napoli et al. [40]
evaluated ChatGPT’s ability to identify and fix vulnerabilities in smart contracts, finding a success rate
of 57.1% after multiple attempts. They conclude that while LLMs can significantly assist developers,
they cannot fully replace expert human oversight, particularly for security-critical tasks. Barbàra et
al. [41] assess GPT-4’s ability to autonomously generate Solidity smart contracts from legal documents
using multiple prompt variants. Their evaluation, limited to 80 lease-agreement contracts and relying
solely on Slither for analysis, reveals that GPT-4 struggles with producing production-ready code
due to subtle bugs and inconsistencies between prompts and outputs. However, the study does not
incorporate advanced symbolic execution analysis tools (e.g., Mythril [
          <xref ref-type="bibr" rid="ref25">25</xref>
          ]), which limits the depth of
vulnerability detection and prevents classification of issues by severity. Olivieri et al. [42] extend this
line of work to Hyperledger Fabric [43, 44], evaluating LLMs’ ability to generate secure smart contracts
in Go from natural language prompts. Their results reveal that while LLMs can accelerate development,
the generated contracts often require substantial debugging and manual intervention due to quality
and security deficiencies.
        </p>
        <p>
          <sup>1</sup> The OWASP website [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ] lists the top 10 vulnerabilities.
        </p>
        <p>Our work builds upon these efforts by conducting a security assessment of LLM-generated smart
contracts. We extend prior research by (1) focusing on Ethereum-based contracts, (2) comparing
both commercial and open-source state-of-the-art LLMs for code synthesis, and (3) evaluating the
effectiveness of multiple static analysis tools, rather than relying solely on dynamic analysis or manual
review. Lastly, complementary to prior studies, our analysis not only identifies the vulnerabilities but
also classifies their severity, revealing the critical risks posed by certain flaws in the generated contracts.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Data Collection, Contract Generation and Analysis</title>
      <p>This section outlines the methodology (illustrated in Figure 1) employed for generating smart contracts
using LLMs and details the experimental workflow designed to evaluate their performance and security.<sup>2</sup></p>
      <p>[Figure 1: workflow. An example prompt ("Create a Solidity smart contract for an ERC20 token named 'ExampleToken' ...") is passed to GPT-4 and DeepSeek-Coder. Stages: Contract Generation, Sanitization and Correction, Contract Analysis, Results.]</p>
      <sec id="sec-3-1">
        <title>3.1. Dataset construction</title>
        <p>In our context, a prompt is the natural language input that guides an LLM to generate functional smart
contract code [45]. To ensure reproducibility and enable systematic evaluation, a well-structured prompt
dataset is essential. For this study, we adopt a dataset consisting of predefined, structured prompts
that explicitly instruct the model to generate smart contracts from natural language descriptions. We
selected an external prompt dataset [46] hosted on Hugging Face [47, 48], a prominent repository for
machine learning resources.</p>
        <p><sup>2</sup> For the code and the chosen prompts, refer to https://github.com/vmirco/Ethereum-Smart-Contract-Generation-and-Analysis.</p>
        <p>To better understand the provenance and the structure of the dataset, we consulted its creators.
According to their documentation, the dataset was created using the Meta LLaMA 3 8B Instruct
model [49], which processed an initial corpus [50] of Solidity smart contracts sourced from public
GitHub repositories. For each contract, the model generated three distinct prompt levels (Beginner,
Average, Expert) with progressively increasing complexity. Listing 1 presents an example from a single
entry in the dataset, illustrating one prompt at the Average level, which is the one adopted in our study.</p>
        <p>""" Create a smart contract that builds upon the WETHOmnibridgeRouter contract by adding
features for account registration, token wrapping, and relay. The contract should
integrate with the Omnibridge and WETH contracts. Include methods for registering
and wrapping tokens, as well as functionality for relaying tokens to specific
recipients. The contract should emit events upon successful token wrapping and
relaying. Consider implementing error handling and validation checks for user input.</p>
        <sec id="sec-3-1-1">
          <title>Listing 1: Average level prompt example.</title>
          <p>Contract Generation: Instructions for the LLMs. We designed prompt templates simulating typical
user inputs via graphical interfaces, specifying contract features without detailed implementation
instructions. This approach evaluates the ability of LLMs to translate natural language specifications
into coherent, deployable Solidity code. For instance, Listing 2 shows a prompt for generating an ERC20
token contract.</p>
          <p>""" Create a Solidity smart contract for an ERC20 token named 'ExampleToken' ('EXT')
with a supply of 1,000,000 tokens. Implement transfer, approve, and transferFrom
functions. Include owner-only minting and token burning functionalities.</p>
        </sec>
        <sec id="sec-3-1-2">
          <title>Listing 2: Prompt example for ERC20 tokens.</title>
          <p>To ensure consistent and deployable output, prompts explicitly require Solidity code compatible with
version 0.8.0, as shown in Listing 3. This fixed version avoids discrepancies in compilation and analysis,
eliminating the need for manual adjustments across different compiler versions.</p>
          <p>""" You will generate deployable smart contract code in Solidity based on the prompt I
provide. Use Solidity version ^0.8.0.</p>
        </sec>
        <sec id="sec-3-1-3">
          <title>Listing 3: Instruction template.</title>
          <p>In practice, LLM outputs sometimes included extraneous text (e.g., introductory phrases, import
statements, or markdown syntax) that broke compilation. We therefore refined the instructions to
restrict output to pure Solidity code and to replace import directives with inlined code to avoid
unresolved dependencies (Listing 4).</p>
          <p>
""" The output should contain only Solidity code — no comments or markdown such as "```
sol". I should be able to copy your response and paste it in a sol file to deploy.
Do not use import statement, only code, if there's any import, replace it with code
for the actual imported contract.</p>
        </sec>
        <sec id="sec-3-1-4">
          <title>Listing 4: Instruction template (cont.).</title>
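          <p>How the instruction templates and a dataset prompt are combined into one request is not spelled out in the text, so the following Python sketch shows only one plausible assembly. The helper name build_request and the choice to join the parts with blank lines are assumptions. The first constant quotes Listing 3, while the second paraphrases the output restrictions of Listing 4 rather than quoting them.</p>
          <preformat>
```python
# Instruction text quoted from Listing 3.
INSTRUCTION = (
    "You will generate deployable smart contract code in Solidity based on "
    "the prompt I provide. Use Solidity version ^0.8.0."
)

# Paraphrase of the output restrictions in Listing 4 (not the verbatim text).
OUTPUT_RULES = (
    "The output should contain only Solidity code, with no comments or "
    "markdown. Do not use import statements; inline the imported code instead."
)


def build_request(task_prompt: str) -> str:
    """Join the fixed instructions with one dataset prompt (hypothetical helper)."""
    task = task_prompt.strip()
    if not task:
        raise ValueError("task prompt must not be empty")
    return "\n\n".join([INSTRUCTION, OUTPUT_RULES, task])


if __name__ == "__main__":
    print(build_request("Create a Solidity smart contract for an ERC20 token ..."))
```
          </preformat>
          <p>Keeping the instructions in constants means every contract is generated under the same constraints, which is what makes the compilation and vulnerability counts comparable across prompts.</p>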
        </sec>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Contract Analysis: Slither and Mythril</title>
        <p>
          To identify potential vulnerabilities in generated smart contracts, we employed two state-of-the-art
tools: Slither [
          <xref ref-type="bibr" rid="ref23 ref24">23, 24</xref>
          ] and Mythril [
          <xref ref-type="bibr" rid="ref25 ref26">25, 26</xref>
          ]. These tools complement each other by combining static and
symbolic analysis techniques, enabling a thorough security assessment.
        </p>
        <p>Slither. Slither is a Python-based static analyzer for Solidity and Vyper that uses an intermediate
representation to detect vulnerabilities. It provides severity classifications for findings: Informational,
Optimization, Low, Medium, and High. The first two categories offer insights and recommendations for
code quality and gas efficiency but do not necessarily indicate security flaws. The latter three denote
increasing severity levels of security risks, helping prioritize remediation efforts. For each contract, we
ran Slither analysis and saved its output. If compilation errors or warnings were reported, the analysis
for that contract was halted. Otherwise, identified vulnerabilities were classified and aggregated to
quantify the overall security risk.</p>
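        <p>The classification and aggregation step described above could look roughly like the Python sketch below. It assumes the report is the parsed output of Slither's JSON mode, with a top-level success flag and findings listed under results.detectors, each carrying an impact field. The helper name summarize_slither and the returned dictionary layout are hypothetical.</p>
        <preformat>
```python
from collections import Counter

# Slither's five impact levels; only the last three indicate security risk.
NON_SECURITY = ("Informational", "Optimization")
SECURITY = ("Low", "Medium", "High")


def summarize_slither(report: dict) -> dict:
    """Count findings per impact level in one parsed Slither JSON report."""
    if not report.get("success", False):
        # Mirrors the text: a contract that fails to compile is not analyzed.
        return {"compiled": False, "by_impact": Counter(), "security_total": 0}
    detectors = (report.get("results") or {}).get("detectors", [])
    by_impact = Counter(d.get("impact", "Unknown") for d in detectors)
    security_total = sum(by_impact[level] for level in SECURITY)
    return {"compiled": True, "by_impact": by_impact, "security_total": security_total}


if __name__ == "__main__":
    sample = {
        "success": True,
        "results": {"detectors": [
            {"check": "reentrancy-eth", "impact": "High"},
            {"check": "solc-version", "impact": "Informational"},
            {"check": "timestamp", "impact": "Low"},
        ]},
    }
    print(summarize_slither(sample))
```
        </preformat>
        <p>Separating the security total from the per-impact counts keeps Informational and Optimization findings visible without letting them inflate the risk figure.</p>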
        <p>Mythril. Mythril performs symbolic execution on Ethereum Virtual Machine bytecode, systematically
exploring execution paths to detect vulnerabilities. Given the high computational cost of symbolic
analysis, we limited Mythril’s runtime to 20 minutes per contract.<sup>3</sup> Mythril outputs vulnerabilities
categorized by severity levels (Low, Medium, and High) and suggests potential mitigations. Our analysis
proceeded in two phases: first, vulnerabilities were grouped by severity to evaluate criticality; second,
they were classified by type to identify the most common security threats.</p>
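        <p>A minimal Python sketch of the time-limited run and the two grouping phases follows. The text does not say how the 20-minute limit was enforced; the sketch assumes Mythril's own execution-timeout option, expressed in seconds, and assumes JSON output whose issues carry severity and swc-id fields. The helper names are hypothetical.</p>
        <preformat>
```python
from collections import Counter


def mythril_command(sol_path: str, minutes: int = 20) -> list:
    """Build a myth analyze call with a per-contract time budget in seconds."""
    return ["myth", "analyze", sol_path,
            "--execution-timeout", str(minutes * 60),
            "-o", "json"]


def group_issues(report: dict):
    """Phase 1: count issues by severity. Phase 2: count them by SWC identifier."""
    issues = report.get("issues", [])
    by_severity = Counter(i.get("severity", "Unknown") for i in issues)
    by_swc = Counter("SWC-" + str(i.get("swc-id", "?")) for i in issues)
    return by_severity, by_swc


if __name__ == "__main__":
    print(" ".join(mythril_command("Contract.sol")))
    sample = {"issues": [
        {"swc-id": "107", "severity": "Low"},
        {"swc-id": "116", "severity": "Low"},
        {"swc-id": "105", "severity": "High"},
    ]}
    print(group_issues(sample))
```
        </preformat>
        <p>The command list can be passed to subprocess.run; it is built separately here so the grouping logic can be exercised without Mythril installed.</p>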
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Results</title>
      <p>This section presents the results of the comparative analysis of 250 Solidity smart contracts generated
using GPT-4 and DeepSeek-Coder. The prompts used for code generation were randomly selected from
the initial dataset, focusing on the Average prompt level. The primary objective of this analysis is to
evaluate the quality of the generated contracts in terms of errors, vulnerabilities, and adherence to
blockchain coding best practices.</p>
      <sec id="sec-4-1">
        <title>4.1. Code Compilation Success Rate</title>
        <p>A key indicator of smart contract reliability is whether the code compiles successfully. Compilation
failures render contracts unusable and typically result from either syntactic errors or missing
dependencies. The latter often stems from the use of import statements that reference unavailable external
libraries. To reduce this issue, we explicitly instructed both models to avoid using import statements
(see Listing 4). Nonetheless, some generated contracts included such statements, leading to failed
analyses by Slither, which effectively detects missing dependencies and syntax issues.</p>
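        <p>Forbidden import statements and leftover markdown can also be spotted before a contract reaches the analyzer. The Python sketch below is one way to do so and is not the sanitization procedure used in the study; the regular expression and both helper names are assumptions.</p>
        <preformat>
```python
import re

# Matches Solidity import directives at the start of a line, up to the semicolon.
IMPORT_RE = re.compile(r"^\s*import\b[^;]*;", re.MULTILINE)
FENCE = "`" * 3  # markdown code-fence marker, built without typing it literally


def strip_markdown(text: str) -> str:
    """Drop markdown fence lines that some model outputs wrap around the code."""
    kept = [ln for ln in text.splitlines() if not ln.strip().startswith(FENCE)]
    return "\n".join(kept).strip()


def find_imports(source: str) -> list:
    """Return every import directive found in a Solidity source string."""
    return [m.group(0).strip() for m in IMPORT_RE.finditer(source)]


if __name__ == "__main__":
    raw = FENCE + 'solidity\nimport "./Ownable.sol";\ncontract A {}\n' + FENCE
    print(find_imports(strip_markdown(raw)))
```
        </preformat>
        <p>A line-anchored pattern is a heuristic: it ignores imports hidden inside comments but would miss unusual formatting, so a compiler run remains the authoritative check.</p>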
        <p>Our results reveal a significant performance gap between the two models. Using Slither, we found
that DeepSeek-Coder produced only 46 non-compilable contracts (18% of its total), while GPT-4 failed
to compile 139 contracts (55%). Focusing on import-related errors, GPT-4 generated 14 contracts with
forbidden import statements, compared to just 2 in the DeepSeek-Coder set. These findings suggest
two key differences. First, DeepSeek-Coder is more capable of generating syntactically correct Solidity
code, likely due to its specialization in code generation. Second, DeepSeek-Coder demonstrates better
adherence to prompt instructions. After being asked to avoid imports, it generally embedded the
required logic directly into the code. In contrast, GPT-4 often ignored the instruction and continued to
use import statements. Although import-related errors are easily fixed in a real development context,
their presence highlights differences in model behavior. Developers typically have access to standard
libraries, but models that rely on unavailable imports during generation are less robust in isolated or
constrained environments.</p>
        <p>To further assess the practical usability of LLM-based smart contract synthesis, we manually corrected
contracts with minor syntax issues or import errors. After this intervention, the non-compilation
rate for GPT-4 dropped to 35.74%, and for DeepSeek-Coder to 15.52%, confirming the latter’s superior
reliability in generating compilable smart contracts.</p>
        <p><sup>3</sup> Analyzing 250 contracts required approximately 80 hours of cumulative runtime, balancing thoroughness and efficiency.</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Average Vulnerabilities per Type</title>
        <p>After sanitizing the smart contracts generated by the two LLMs, we analyzed the number of
vulnerabilities detected by Slither. Among all contracts generated by GPT-4, 35.74% were free of vulnerabilities,
while 28.52% contained at least one security issue. The remaining 35.74% were excluded from this analysis
due to compilation failure. In practice, this means that approximately one out of three contracts
produced by GPT-4 compiled successfully and was free of vulnerabilities. For DeepSeek-Coder, among
the 84.48% of contracts that compiled successfully, 68.96% were found to contain vulnerabilities.
Despite being optimized for code generation, this indicates that roughly two-thirds of DeepSeek-Coder’s
compilable contracts were still insecure.</p>
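        <p>The GPT-4 percentages above are shares of all generated contracts. As a worked check, the Python sketch below re-expresses them as shares of the compilable subset only. The three constants are taken from the text; the helper name is hypothetical.</p>
        <preformat>
```python
# Shares of all GPT-4 contracts, as reported in the text (percent).
NOT_COMPILABLE = 35.74
CLEAN = 35.74
VULNERABLE = 28.52


def conditional_share(part: float, whole: float) -> float:
    """Percentage that part represents of whole."""
    if whole == 0:
        raise ValueError("whole must be non-zero")
    return 100.0 * part / whole


compilable = CLEAN + VULNERABLE  # 64.26 percent of all contracts
clean_given_compilable = conditional_share(CLEAN, compilable)
vulnerable_given_compilable = conditional_share(VULNERABLE, compilable)

if __name__ == "__main__":
    # prints: 55.6 44.4
    print(round(clean_given_compilable, 1), round(vulnerable_given_compilable, 1))
```
        </preformat>
        <p>Under this reading, a little over half of the GPT-4 contracts that compiled were reported free of vulnerabilities by Slither.</p>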
        <p>To further understand the nature of these issues, we analyzed the distribution of vulnerability
severities within the subset of compilable and vulnerable contracts, using both Mythril and Slither. As
shown in Table 1, both models exhibited a consistent pattern: Low-severity vulnerabilities appeared most
frequently, followed by Medium and then High severity issues. Lastly, we observe a significant difference
in the output between the two tools. Slither reports additional categories, such as Informational and
Optimization, which collectively account for over 80% of its findings, as reflected in the last two columns
of the table. The Informational category highlights general coding issues or patterns that may affect
readability or maintainability, while Optimization refers to gas-inefficient code that, although not
directly insecure, could lead to increased execution costs on-chain.</p>
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Average Vulnerabilities per Contract</title>
        <p>In this section we provide an additional analysis conducted with Slither and Mythril focusing on the
average number of vulnerabilities detected per contract.</p>
        <p>Using Slither. Both models yield comparable results, with GPT-4 showing a slight advantage when
considering all vulnerability categories. The averages are calculated from the subset of vulnerable
contracts and thus do not represent the entire set of generated contracts. The average of about seven
findings per contract may seem high, but this figure
drops to approximately one when the Informational and Optimization categories are excluded. In this
refined evaluation, DeepSeek-Coder exhibits a slight advantage.</p>
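        <p>The effect of excluding the two non-security categories on the per-contract average can be illustrated with the Python sketch below. The input counts are invented for illustration and are not the study's data. The rule that a contract enters the average only if it has at least one counted finding is also an assumption about how the vulnerable subset is defined.</p>
        <preformat>
```python
SECURITY = ("Low", "Medium", "High")


def average_findings(per_contract: list, security_only: bool = False) -> float:
    """Mean findings per contract, over contracts with at least one counted finding."""
    totals = []
    for counts in per_contract:
        if security_only:
            n = sum(counts.get(level, 0) for level in SECURITY)
        else:
            n = sum(counts.values())
        if n > 0:
            totals.append(n)
    return sum(totals) / len(totals) if totals else 0.0


if __name__ == "__main__":
    # Made-up per-contract counts by Slither impact level.
    demo = [
        {"Informational": 5, "Optimization": 1, "Low": 1},
        {"Informational": 6, "Medium": 1},
        {"Informational": 7},
    ]
    print(average_findings(demo))                      # prints: 7.0
    print(average_findings(demo, security_only=True))  # prints: 1.0
```
        </preformat>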
        <p>Using Mythril. Before interpreting these results, it is important to recall that Mythril was executed
with a processing time constraint of 20 minutes per contract. This limitation may have affected the
accuracy of the findings compared to the full capabilities of the tool. Table 2 presents the vulnerabilities
identified by Mythril, classified according to their SWC (Smart Contract Weakness Classification)
tags [51]. Although GPT-4 produced a higher number of non-compilable contracts, most of its compilable
contracts were secure, resulting in 64 detected vulnerabilities. In contrast, contracts generated by
DeepSeek-Coder exhibited 130 vulnerabilities. The most prevalent issue was reentrancy (SWC-107), a
well-known vulnerability neither model could mitigate effectively, highlighting the challenges LLMs face
in understanding the complexities of self-executing functions. Additionally, DeepSeek-Coder showed
a higher occurrence of SWC-116, which relates to the use of built-in variables such as block.number
and block.timestamp to trigger time-dependent events. Another notable vulnerability, unprotected
Ether withdrawal (SWC-105), was found in contracts generated by both models at similar rates. This
vulnerability is particularly critical, as it allows unauthorized users to withdraw contract funds in an
uncontrolled manner, potentially leading to financial loss.</p>
      </sec>
      <sec id="sec-4-4">
        <title>4.4. Contract Code Analysis</title>
        <p>In this section, we analyze one representative smart contract generated by each model, starting with
DeepSeek-Coder. The prompt provided to both models is shown in Listing 5, and an excerpt of the
generated Solidity code is presented in Listing 6.</p>
        <p>""" Create a smart contract for a cryptocurrency-based miner engineer game. The contract should allow
players to buy and sell engineer characters, buy boosters for mining, and change virus types. The
contract should have an interface for mining war games and mini-games. The contract structure should
include defining structs for miner and boost data, mapping player information, and implementing
functions for buying and selling engineers. Implement the contract functionality by writing Solidity
code that mirrors the functionality of the provided code snippet.</p>
        <p>Listing 5: Input Prompt.</p>
        <p>[Listing 6: excerpt of the Solidity code generated by DeepSeek-Coder.]</p>
        <p>The code produced by DeepSeek-Coder contains critical issues related to uninitialized variables,
specifically the miners and boosters mappings. These state variables are defined but accessed without
prior initialization, as flagged by Slither. In Solidity, reading from an uninitialized mapping entry
returns a default-constructed instance. As shown in lines 2 and 3, this behavior may allow a malicious
actor to acquire a Miner with no cost, since its price defaults to zero. Such a vulnerability could be
exploited to gain unfair advantages in application logic, such as mining or trading, without spending
any resources. This example highlights the importance of explicitly initializing state variables to ensure
correct behavior and prevent unintended vulnerabilities.</p>
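        <p>The default-value behaviour described above can be mimicked in a few lines of Python. This is a model of the semantics, not the generated Solidity code. The field names price and power and both function names are hypothetical. A defaultdict stands in for a Solidity mapping, whose unset entries read as zero-initialized structs.</p>
        <preformat>
```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Miner:
    # Solidity zero-initializes every field of a struct that was never written.
    price: int = 0
    power: int = 0


# Stand-in for a Solidity mapping(uint => Miner): unknown keys yield a default Miner.
miners = defaultdict(Miner)


def buy_unchecked(balance: int, miner_id: int) -> int:
    """Charge the stored price without checking that the miner was ever configured."""
    price = miners[miner_id].price
    if price > balance:
        raise ValueError("insufficient balance")
    return balance - price


def buy_checked(balance: int, miner_id: int) -> int:
    """Same purchase, but reject entries that still hold the default price of zero."""
    price = miners[miner_id].price
    if price == 0:
        raise ValueError("miner type not initialized")
    if price > balance:
        raise ValueError("insufficient balance")
    return balance - price
```
        </preformat>
        <p>Calling buy_unchecked(10, 99) for a miner type that was never set up returns the full balance of 10, so the purchase is free, whereas buy_checked raises an error for the same input.</p>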
        <p>By contrast, the code generated by GPT-4 for the same prompt avoids this issue. It enforces a check
for sufficient user balance (e.g., at least 1 Ether) before creating a Miner, and it initializes the Miner
instance with explicit attributes. This approach prevents access to default-constructed mappings by
ensuring that objects are created only under valid conditions. The solution generated by GPT-4 therefore
reflects better coding practices and more proactive vulnerability prevention.</p>
      </sec>
      <sec id="sec-4-5">
        <title>4.5. Discussion</title>
        <p>As shown, both GPT-4 and DeepSeek-Coder were able to generate smart contracts that responded to
prompt requirements, though their outputs frequently contained vulnerabilities. GPT-4 demonstrated
stronger alignment with complex prompts but produced a higher proportion of non-compilable contracts,
mainly due to unresolved import statements and syntactic inconsistencies. In contrast, DeepSeek-Coder
showed greater syntactic reliability, with fewer than 20% of its contracts failing to compile. However, its
outputs contained more vulnerabilities, especially those related to reentrancy (SWC-107) and improper
timestamp usage (SWC-116).</p>
        <p>Our comparative analysis revealed recurring issues in LLM-generated code, including unchecked
return values, flawed authorization mechanisms, and logical errors. While Slither provided broader
categories, including Informational and Optimization insights, Mythril offered more granular detection of
specific vulnerability types. Reentrancy, misuse of built-in variables, and unprotected Ether withdrawals
were among the most common issues across both models.</p>
        <p>In summary, while LLMs exhibit promising capabilities in automating smart contract development,
our findings indicate that they are not yet reliable enough to ensure the security and correctness required
for deployment in real-world blockchain environments. Revisiting our research question, “Can LLMs
accelerate smart contract development while ensuring high-quality, secure code?”, the empirical evidence
points toward a cautious conclusion. Although LLMs can support early-stage prototyping and reduce
development effort, they currently fall short of meeting the standards necessary for producing robust,
vulnerability-free code without human oversight.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>This work investigated the capability of LLMs, namely GPT-4 and DeepSeek-Coder, to autonomously
generate secure Solidity smart contracts for the Ethereum blockchain. Using a dataset of predefined
prompts and leveraging static and symbolic analysis tools, we evaluated contract correctness, adherence
to prompt specifications, and vulnerability presence. Our results suggest that while LLMs can generate
syntactically correct and functionally relevant smart contracts, their outputs are not yet sufficiently
reliable for fully autonomous deployment. The generated contracts frequently require manual review
and correction to address security weaknesses.</p>
      <p>This study has some limitations that may affect the generalizability of its results. The dataset of 250
smart contracts may not fully capture the diversity and complexity of real-world contracts. Additionally,
the use of predefined prompt templates could influence model outputs and introduce bias. However,
the methodology is consistent with related studies, and the dataset is as large as or larger than those they use,
providing a meaningful basis for comparison and analysis.</p>
      <p>Future research should focus on improving the ability of LLMs to handle common smart contract
vulnerabilities like reentrancy and to better understand the context of decentralized systems. Using LLMs
as supportive tools within development teams, working alongside experts, can enhance their usefulness
while reducing potential errors. Since LLM technology is advancing quickly, regular assessment will
help guide how these models can be safely and effectively applied in smart contract development.</p>
      <p>Acknowledgment</p>
      <p>This work was partially supported by project FISA-2023-00128 funded by the MUR program “Fondo
italiano per le scienze applicate”; the EU - NGEU National Sustainable Mobility Center (CN00000023),
Italian Ministry of University and Research Decree n. 1033 - 17/06/2022 (Spoke 10); and projects SERICS
(PE00000014) and FAIR (PE00000013) under the MUR National Recovery and Resilience Plan funded by
the European Union - NextGenerationEU.</p>
      <sec id="sec-5-1">
        <title>Declaration on Generative AI</title>
        <p>The author(s) have not employed any Generative AI tools.</p>
        <p>(SP), IEEE, 2022, pp. 754–768.
[37] R. Khoury, A. R. Avila, J. Brunelle, B. M. Camara, How secure is code generated by ChatGPT?,
in: 2023 IEEE International Conference on Systems, Man, and Cybernetics (SMC), IEEE, 2023, pp.
2445–2451.
[38] R. Li, L. B. Allal, Y. Zi, N. Muennighoff, D. Kocetkov, C. Mou, M. Marone, C. Akiki, J. Li, J. Chim,
et al., StarCoder: may the source be with you!, arXiv preprint arXiv:2305.06161 (2023).
[39] R. Karanjai, E. Li, L. Xu, W. Shi, Who is Smarter? An Empirical Study of AI-Based Smart Contract
Creation, in: 2023 5th Conference on Blockchain Research and Applications for Innovative
Networks and Services (BRAINS), 2023, pp. 1–8. doi:10.1109/BRAINS59668.2023.10316829.
[40] E. A. Napoli, V. Gatteschi, Evaluating ChatGPT for Smart Contracts Vulnerability Correction, in:
2023 IEEE 47th Annual Computers, Software, and Applications Conference (COMPSAC), 2023, pp.
1828–1833. doi:10.1109/COMPSAC57700.2023.00283.
[41] F. Barbàra, E. A. Napoli, V. Gatteschi, C. Schifanella, Automatic smart contract generation through
llms: When the stochastic parrot fails, in: 6th Distributed Ledger Technology Workshop, 2024.
[42] L. Olivieri, D. Beste, L. Negrini, L. Schönherr, A. E. Cinà, P. Ferrara, Code generation of smart
contracts with llms: A case study on hyperledger fabric, in: 2025 IEEE 36th International Symposium
on Software Reliability Engineering (ISSRE), IEEE, 2025.
[43] L. Olivieri, L. Negrini, V. Arceri, P. Ferrara, A. Cortesi, Detection of read-write issues in
hyperledger fabric smart contracts, in: Proceedings of the 40th ACM/SIGAPP Symposium on Applied
Computing, 2025, pp. 329–337.
[44] L. Olivieri, L. Negrini, V. Arceri, B. Chachar, P. Ferrara, A. Cortesi, Detection of phantom reads in
hyperledger fabric, IEEE Access 12 (2024) 80687–80697. doi:10.1109/ACCESS.2024.3410019.
[45] J. Oppenlaender, R. Linder, J. Silvennoinen, Prompting AI Art: An Investigation into the Creative
Skill of Prompt Engineering, International Journal of Human–Computer Interaction 0 (2024)
1–23. URL: https://doi.org/10.1080/10447318.2024.2431761. doi:10.1080/10447318.2024.2431761.
[46] braindao, Solidity Dataset with Prompts, 2024. URL: https://huggingface.co/datasets/braindao/
solidity-dataset.
[47] C. Osborne, J. Ding, H. R. Kirk, The AI community building the future? A quantitative
analysis of development activity on Hugging Face Hub, Journal of Computational Social
Science 7 (2024) 2067–2105. URL: http://dx.doi.org/10.1007/s42001-024-00300-8.
doi:10.1007/s42001-024-00300-8.
[48] Hugging Face, Inc., Hugging Face, 2016. URL: https://huggingface.co/.
[49] Meta, The LLaMA 3 Herd of Models, 2024. URL: https://arxiv.org/abs/2407.21783.
arXiv:2407.21783.
[50] seyyedaliayati, Solidity Dataset, 2023. URL: https://huggingface.co/datasets/seyyedaliayati/
solidity-dataset.
[51] SWCRegistry, Smart Contract Weakness Classification, https://swcregistry.io/, 2025. Accessed:
2025-07-26.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>B.</given-names>
            <surname>Roziere</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gehring</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Gloeckle</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Sootla</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Gat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X. E.</given-names>
            <surname>Tan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Adi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Remez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Rapin</surname>
          </string-name>
          , et al.,
          <article-title>Code LLaMA: Open foundation models for code</article-title>
          ,
          <source>arXiv:2308.12950</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          Salesforce AI Research,
          <article-title>CodeGen2.5: Small, but Mighty</article-title>
          , https://blog.salesforceairesearch.com/codegen25/,
          <year>2023</year>
          . Accessed: 2025-07.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>M.</given-names>
            <surname>Schäfer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Nadi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Eghbali</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Tip</surname>
          </string-name>
          ,
          <article-title>An empirical evaluation of using large language models for automated unit test generation</article-title>
          ,
          <source>IEEE Transactions on Software Engineering</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>T.</given-names>
            <surname>Dohmke</surname>
          </string-name>
          ,
          <article-title>GitHub Copilot is generally available to all developers</article-title>
          ,
          <year>2022</year>
          . URL: https://github.blog/2022-06-21-github-copilot-is-generally-available-to-all-developers/ (Accessed: 2025-07).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Majdoub</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Ben Charrada</surname>
          </string-name>
          ,
          <article-title>Debugging with open-source large language models: An evaluation</article-title>
          ,
          <source>in: Proceedings of the 18th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement</source>
          ,
          <year>2024</year>
          , pp.
          <fpage>510</fpage>
          -
          <lpage>516</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>A.</given-names>
            <surname>Ziegler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Kalliamvakou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Simister</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Sittampalam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. S.</given-names>
            <surname>Rice</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Rifkin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Aftandilian</surname>
          </string-name>
          ,
          <article-title>Productivity assessment of neural code completion</article-title>
          ,
          <source>6th ACM SIGPLAN International Symposium on Machine Programming</source>
          (
          <year>2022</year>
          ). URL: https://doi.org/10.1145/3520312.3534864. doi:10.1145/3520312.3534864.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>M.</given-names>
            <surname>Tabachnyk</surname>
          </string-name>
          ,
          <article-title>ML-enhanced code completion improves developer productivity</article-title>
          ,
          <year>2022</year>
          . URL: https://blog.research.google/2022/07/ml-enhanced-code-completion-improves.html?m=1.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>G.</given-names>
            <surname>Wood</surname>
          </string-name>
          ,
          <article-title>Ethereum: A Secure Decentralised Generalised Transaction Ledger</article-title>
          ,
          <source>Ethereum Project Yellow Paper</source>
          (
          <year>2014</year>
          ). URL: https://cryptodeep.ru/doc/paper.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>G. A.</given-names>
            <surname>Oliva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. E.</given-names>
            <surname>Hassan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z. M.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <article-title>An exploratory study of smart contracts in the Ethereum blockchain platform</article-title>
          ,
          <source>Empirical Software Engineering</source>
          <volume>25</volume>
          (
          <year>2020</year>
          )
          <fpage>1864</fpage>
          -
          <lpage>1904</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>J.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Vechev</surname>
          </string-name>
          ,
          <article-title>Large Language Models for Code: Security Hardening and Adversarial Testing</article-title>
          ,
          <source>in: Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security</source>
          , CCS '23, Association for Computing Machinery, New York, NY, USA,
          <year>2023</year>
          , pp.
          <fpage>1865</fpage>
          -
          <lpage>1879</lpage>
          . URL: https://doi.org/10.1145/3576915.3623175. doi:10.1145/3576915.3623175.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>S.</given-names>
            <surname>Kalra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Goel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Dhawan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Sharma</surname>
          </string-name>
          ,
          <article-title>Zeus: analyzing safety of smart contracts</article-title>
          ,
          <source>in: NDSS</source>
          ,
          <year>2018</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>12</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>E.</given-names>
            <surname>Zamani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Phillips</surname>
          </string-name>
          ,
          <article-title>On the security risks of the blockchain</article-title>
          ,
          <source>Journal of Computer Information Systems</source>
          <volume>60</volume>
          (
          <year>2020</year>
          )
          <fpage>495</fpage>
          -
          <lpage>506</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>N.</given-names>
            <surname>Szabo</surname>
          </string-name>
          ,
          <article-title>Formalizing and securing relationships on public networks</article-title>
          ,
          <source>First Monday</source>
          (
          <year>1997</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>P.</given-names>
            <surname>Zheng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Zheng</surname>
          </string-name>
          ,
          <article-title>Blockchain-Based Decentralized Application: A Survey</article-title>
          ,
          <source>IEEE Open Journal of the Computer Society</source>
          <volume>4</volume>
          (
          <year>2023</year>
          )
          <fpage>121</fpage>
          -
          <lpage>133</lpage>
          . doi:10.1109/OJCS.2023.3251854.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          OWASP Foundation,
          <source>OWASP Smart Contract Top 10</source>
          , https://owasp.org/www-project-smart-contract-top-10/,
          <year>2025</year>
          . Accessed: 2025-07.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>T.</given-names>
            <surname>Jiao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Qi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Wen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Xiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Nan</surname>
          </string-name>
          ,
          <article-title>A Survey of Ethereum Smart Contract Security: Attacks and Detection</article-title>
          ,
          <source>Distrib. Ledger Technol.</source>
          (
          <year>2024</year>
          ). URL: https://doi.org/10.1145/3643895. doi:10.1145/3643895.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>V.</given-names>
            <surname>Arceri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Merenda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Dolcetti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Negrini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Olivieri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Zaffanella</surname>
          </string-name>
          ,
          <article-title>Towards a sound construction of EVM bytecode control-flow graphs</article-title>
          ,
          <source>in: Proceedings of the 26th ACM International Workshop on Formal Techniques for Java-like Programs</source>
          , FTfJP 2024, ACM,
          <year>2024</year>
          , pp.
          <fpage>11</fpage>
          -
          <lpage>16</lpage>
          . doi:10.1145/3678721.3686227.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>M.</given-names>
            <surname>Mehar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Shier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Giambattista</surname>
          </string-name>
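          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Gong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Fletcher</surname>
          </string-name>
          ,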
          <string-name>
            <given-names>R.</given-names>
            <surname>Sanayhie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. M.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Laskowski</surname>
          </string-name>
          ,
          <article-title>Understanding a Revolutionary and Flawed Grand Experiment in Blockchain: The DAO Attack</article-title>
          ,
          <source>Journal of Cases on Information Technology</source>
          <volume>21</volume>
          ,
          <year>2024</year>
          . URL: arXiv:2401.14196.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>G.</given-names>
            <surname>Destefanis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Marchesi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ortu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Tonelli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bracciali</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Hierons</surname>
          </string-name>
          ,
          <article-title>Smart contracts vulnerabilities: a call for blockchain software engineering?</article-title>
          ,
          <source>in: 2018 International Workshop on Blockchain Oriented Software Engineering (IWBOSE)</source>
          ,
          <year>2018</year>
          , pp.
          <fpage>19</fpage>
          -
          <lpage>25</lpage>
          . doi:10.1109/IWBOSE.2018.8327567.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>C.</given-names>
            <surname>Ferreira Torres</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. K.</given-names>
            <surname>Iannillo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gervais</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>State</surname>
          </string-name>
          ,
          <article-title>The Eye of Horus: Spotting and Analyzing Attacks on Ethereum Smart Contracts</article-title>
          ,
          <source>in: Financial Cryptography and Data Security: 25th International Conference, FC 2021, Virtual Event, March 1-5, 2021, Revised Selected Papers, Part I</source>
          , Springer-Verlag, Berlin, Heidelberg,
          <year>2021</year>
          , pp.
          <fpage>33</fpage>
          -
          <lpage>52</lpage>
          . URL: https://doi.org/10.1007/978-3-662-64322-8_2. doi:10.1007/978-3-662-64322-8_2.
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>S.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ryu</surname>
          </string-name>
          ,
          <article-title>Analysis of Blockchain Smart Contracts: Techniques and Insights</article-title>
          ,
          <source>in: 2020 IEEE Secure Development (SecDev)</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>65</fpage>
          -
          <lpage>73</lpage>
          . doi:10.1109/SecDev45635.2020.00026.
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>C.</given-names>
            <surname>Sendner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Petzi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Stang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Dmitrienko</surname>
          </string-name>
          ,
          <article-title>Large-Scale Study of Vulnerability Scanners for Ethereum Smart Contracts</article-title>
          ,
          <source>in: 2024 IEEE Symposium on Security and Privacy (SP)</source>
          , IEEE Computer Society, Los Alamitos, CA, USA,
          <year>2024</year>
          , pp.
          <fpage>220</fpage>
          -
          <lpage>220</lpage>
          . URL: https://doi.ieeecomputersociety.org/10.1109/SP54263.2024.00230. doi:10.1109/SP54263.2024.00230.
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>J.</given-names>
            <surname>Feist</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Grieco</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Groce</surname>
          </string-name>
          ,
          <article-title>Slither: A Static Analysis Framework for Smart Contracts</article-title>
          ,
          <year>2019</year>
          . URL: https://arxiv.org/abs/1908.09878.
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <surname>Crytic</surname>
          </string-name>
          , Slither: The Smart Contract Static Analyzer, https://crytic.github.io/slither/slither.html,
          <year>2019</year>
          . Accessed: 2025-07.
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>N.</given-names>
            <surname>Sharma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Sharma</surname>
          </string-name>
          ,
          <article-title>A Survey of Mythril, A Smart Contract Security Analysis Tool for EVM Bytecode</article-title>
          ,
          <source>Indian Journal of Natural Sciences</source>
          <volume>13</volume>
          (
          <year>2022</year>
          )
          <fpage>51003</fpage>
          -
          <lpage>51010</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <surname>Consensys</surname>
          </string-name>
          , Mythril GitHub Repository,
          <year>2025</year>
          . URL: https://github.com/ConsenSys/mythril. Accessed: 2025-07-26.
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>A.</given-names>
            <surname>Vaswani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Shazeer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Parmar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Uszkoreit</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Jones</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. N.</given-names>
            <surname>Gomez</surname>
          </string-name>
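          ,
          <string-name>
            <given-names>Ł.</given-names>
            <surname>Kaiser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Polosukhin</surname>
          </string-name>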
          ,
          <article-title>Attention is all you need</article-title>
          ,
          <source>Advances in neural information processing systems</source>
          <volume>30</volume>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>T.</given-names>
            <surname>Brown</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Mann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ryder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Subbiah</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. D.</given-names>
            <surname>Kaplan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Dhariwal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Neelakantan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Shyam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Sastry</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Askell</surname>
          </string-name>
          , et al.,
          <article-title>Language models are few-shot learners</article-title>
          ,
          <source>Advances in neural information processing systems</source>
          <volume>33</volume>
          (
          <year>2020</year>
          )
          <fpage>1877</fpage>
          -
          <lpage>1901</lpage>
          . URL: https://doi.org/10.5555/3495724.3495883. doi:10.5555/3495724.3495883.
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>B.</given-names>
            <surname>Acharya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Lazzaro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>López-Morales</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Oest</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Saad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. E.</given-names>
            <surname>Cinà</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Schönherr</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Holz</surname>
          </string-name>
          ,
          <article-title>The imitation game: exploring brand impersonation attacks on social media platforms</article-title>
          ,
          <source>in: Proceedings of the 33rd USENIX Conference on Security Symposium</source>
          ,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>B.</given-names>
            <surname>Acharya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Saad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. E.</given-names>
            <surname>Cinà</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Schönherr</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. D.</given-names>
            <surname>Nguyen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Oest</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Vadrevu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Holz</surname>
          </string-name>
          ,
          <article-title>Conning the Crypto Conman: End-to-End Analysis of Cryptocurrency-based Technical Support Scams</article-title>
          ,
          <source>in: IEEE Symposium on Security and Privacy (SP)</source>
          ,
          <year>2024</year>
          , pp.
          <fpage>17</fpage>
          -
          <lpage>35</lpage>
          . URL: https://doi.org/10.1109/SP54263.2024.00156. doi:10.1109/SP54263.2024.00156.
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>B.</given-names>
            <surname>Acharya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Lazzaro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. E.</given-names>
            <surname>Cinà</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Holz</surname>
          </string-name>
          ,
          <article-title>Pirates of charity: Exploring donation-based abuses in social media platforms</article-title>
          ,
          <source>in: Proceedings of the ACM on Web Conference 2025</source>
          ,
          <year>2025</year>
          , pp.
          <fpage>3968</fpage>
          -
          <lpage>3981</lpage>
          . URL: https://doi.org/10.1145/3696410.3714634. doi:10.1145/3696410.3714634.
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>M.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Tworek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Jun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Yuan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Ponde</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Kaplan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Edwards</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Burda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Joseph</surname>
          </string-name>
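          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Brockman</surname>
          </string-name>
          , et al.,
          <article-title>Evaluating large language models trained on code</article-title>
          ,
          <source>arXiv:2107.03374</source>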
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [33]
          <string-name>
            <given-names>H.</given-names>
            <surname>Joshi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. C.</given-names>
            <surname>Sanchez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Gulwani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Le</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Verbruggen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Radiček</surname>
          </string-name>
          ,
          <article-title>Repair is nearly generation: Multilingual program repair with llms</article-title>
          ,
          <source>in: AAAI Conference on Artificial Intelligence</source>
          ,
          <year>2023</year>
          . URL: https://doi.org/10.1609/aaai.v37i4.25642. doi:10.1609/aaai.v37i4.25642.
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          [34]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. S.</given-names>
            <surname>Xia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <article-title>Copiloting the copilots: Fusing large language models with completion engines for automated program repair</article-title>
          ,
          <source>in: ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering</source>
          ,
          <year>2023</year>
          . URL: https://doi.org/10.1145/3611643.3616271. doi:10.1145/3611643.3616271.
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          [35]
          <string-name>
            <given-names>D.</given-names>
            <surname>Beste</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Menguy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Hajipour</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Fritz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. E.</given-names>
            <surname>Cinà</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bardin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Holz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Eisenhofer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Schönherr</surname>
          </string-name>
          ,
          <article-title>Exploring the potential of llms for code deobfuscation</article-title>
          ,
          <source>in: International Conference on Detection of Intrusions and Malware, and Vulnerability Assessment</source>
          , Springer,
          <year>2025</year>
          , pp.
          <fpage>267</fpage>
          -
          <lpage>286</lpage>
          . URL: https://doi.org/10.1007/978-3-031-97620-9_15. doi:10.1007/978-3-031-97620-9_15.
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          [36]
          <string-name>
            <given-names>H.</given-names>
            <surname>Pearce</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Ahmad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Tan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Dolan-Gavitt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Karri</surname>
          </string-name>
          ,
          <article-title>Asleep at the keyboard? assessing the security of github copilot's code contributions</article-title>
          ,
          <source>in: 2022 IEEE Symposium on Security and Privacy</source>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>