<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <article-meta>
      <title-group>
        <article-title>UAV detection with YOLO on a standalone Raspberry Pi 5 system</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Iryna Shevchuk</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yurii Kryvenchuk</string-name>
          <email>yurii.p.kryvenchuk@lpnu.ua</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Rostyslav Stupnytskyi</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Andriy Shevchuk</string-name>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Beata Kushka</string-name>
          <email>beatakushka@yahoo.fr</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Digital Economics and Business Analytics, Ivan Franko National University of Lviv</institution>
          ,
          <addr-line>Lviv, 79000</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Foreign Languages, Lviv Polytechnic National University</institution>
          ,
          <addr-line>Lviv, 79013</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Department of Systems of Artificial Intelligence, Lviv Polytechnic National University</institution>
          ,
          <addr-line>Lviv, 79005</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Department of Economics and Management, University of Economics and Entrepreneurship</institution>
          ,
          <addr-line>Khmelnytskyi, 29001</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The rapid proliferation of uncrewed aerial vehicles (UAVs) has raised many concerns regarding security, privacy, and airspace regulation. Implementing real-time UAV detection systems is important for anticipating threats from unauthorized aerial activities. This research critically examines the feasibility of deploying a real-time drone detection pipeline leveraging YOLOv8 on a Raspberry Pi 5, integrated with an RP Camera Module 3, as a self-sufficient, low-power detection apparatus. We systematically evaluated multiple YOLOv8 variants, including YOLOv8n, YOLOv8 with Selective Knowledge Distillation (SKD), and a hardware-optimized implementation accelerated with HailoAI. Our methodological approach rigorously assesses these configurations with respect to detection accuracy, computational overhead, and real-time inference capabilities. Through empirical analysis of key performance indicators such as precision, recall, and frame rate, our findings underscore the efficacy of strategic model optimizations in system performance. The results show that the Raspberry Pi 5 is a suitable platform for low-cost, real-time drone detection despite its inherent computational constraints. We also analyse the trade-offs between detection accuracy and computational efficiency and investigate the role of hardware acceleration in improving system speed. This study contributes substantively to embedded artificial intelligence and reinforces the pivotal role of lightweight deep learning architectures in enhancing security and surveillance applications in computationally restrictive environments.</p>
      </abstract>
      <kwd-group>
        <kwd>YOLOv8</kwd>
        <kwd>autonomous systems</kwd>
        <kwd>Raspberry Pi 5</kwd>
        <kwd>drone detection</kwd>
        <kwd>deep learning</kwd>
        <kwd>computer vision</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Drones are becoming increasingly useful in both civilian and military applications [1]. Detecting them has therefore become more significant for public security, infrastructure protection, and the military [2]. Illegal drones pose a serious threat: they infringe on people's privacy and endanger businesses, so it is crucial to develop effective ways of detecting them [3]. Traditional methods, for instance radar and optical sensors, can work but are expensive, have high energy requirements, and are limited in terms of mobility or decentralized use [4]. This is where AI-driven approaches step in [5]. Deep learning models implemented on compact, power-efficient devices like the Raspberry Pi 5 are a good substitute [6]. These models employ CNNs to evaluate images and can accurately detect UAVs and tell them apart from other objects in real time [7]. Machine learning, when combined with embedded systems and computer vision frameworks running on the Raspberry Pi, gives a lightweight and robust solution for monitoring drones in environments with limited resources [8].</p>
      <p>This technology is especially useful in situations that demand constant monitoring, such as airport security, battlefield surveillance, or critical infrastructure protection [9]. It is becoming possible to run deep learning models on low-power hardware without a significant loss of accuracy thanks to recent advances in optimization methods such as pruning, quantization, and knowledge distillation [10].</p>
    </sec>
    <sec id="sec-2">
      <title>2. Analysis of recent publications</title>
      <p>Drones have become a prominent focus in the domain of real-time uncrewed aerial vehicle
detection, with contemporary research largely oriented toward refining deep learning models for
embedded systems [11,12]. The growing availability of affordable miniaturized hardware and
enhanced camera modules has further driven the development of compact, high-performance
detection algorithms. At the same time, the evolution of convolutional neural networks has fueled
the creation of object detection frameworks—such as Faster R-CNN, SSD, and YOLO—which have
been extensively employed in UAV identification tasks [11,12]. Despite their effectiveness, deploying
these models on resource-constrained devices like the Raspberry Pi 5 poses considerable challenges,
particularly in terms of computational efficiency, energy consumption, and inference latency [13].
Researchers often encounter issues like limited memory bandwidth and relatively modest GPU
capabilities, making it essential to optimize architectures for real-time drone surveillance.</p>
      <p>Conventional UAV detection methodologies frequently rely on radar-based and optical
sensor systems, which can be computationally expensive and require specialized, high-end
hardware. This dependence renders them less practical for mobile or edge-oriented operations,
where flexibility and low power consumption are critical [14]. By contrast, deep learning-based
detection techniques have enabled real-time UAV recognition through image processing, offering a
more adaptable alternative [15,16]. However, implementing such models on lightweight embedded
hardware demands thorough optimization to balance detection accuracy with computational
viability [17]. Techniques ranging from layer pruning to precision reduction (e.g., FP16 or INT8
arithmetic) can substantially decrease model size and processing load, yet they must be applied
carefully to avoid undue performance penalties in challenging scenarios like partial drone occlusion
or rapid movements.</p>
      <p>YOLO (You Only Look Once) stands out among existing real-time detection frameworks for
its rapid inference capabilities and robust detection accuracy [18]. In particular, YOLOv8
incorporates notable architectural refinements and advanced training mechanisms that further
elevate detection efficacy under diverse circumstances [19,20]. Ongoing efforts have examined a
variety of optimization methods—encompassing model quantization, pruning, and knowledge
distillation—to adapt YOLO for resource-constrained scenarios [21,22]. In tandem, specialized
hardware accelerators such as HailoAI are being leveraged to shift compute-intensive layers off the
primary processor, substantially boosting real-time performance and system throughput [23]. Such
an approach can also mitigate thermal issues on the main board, which is often a limiting factor in
continuous edge processing applications.</p>
      <p>While past research verifies the feasibility of deploying UAV detection algorithms on
embedded devices, certain limitations remain—specifically, preserving stable real-time inference
under a broad spectrum of environmental factors [24,25]. For instance, abrupt weather shifts or
highly cluttered backgrounds may lead to transient detection drop-offs, unless the underlying
models are robustly trained and systematically tuned. Future work may focus on adaptive model
designs capable of dynamically adjusting to variable scene complexities [26]. This could entail
runtime modifications to network depth or feature resolution, depending on current resource
availability and observed scene complexity. In addition, incorporating multi-modal sensor fusion
(combining image-based recognition with acoustic or thermal inputs) could greatly bolster detection
fidelity across diverse and unpredictable operational conditions [27,28]. By intelligently combining
signals from different sensor types, systems can better isolate drone signatures in poor lighting or
when strong reflections confuse purely optical methods.</p>
    </sec>
    <sec id="sec-3">
      <title>3. The purpose and objectives of the research</title>
      <p>The aim of this research is to enhance the efficiency and accuracy of UAV detection on embedded
systems by optimizing the implementation of YOLOv8 for deployment on the Raspberry Pi 5. This
involves refining the balance between inference speed, resource utilization, and detection
performance.</p>
      <p>To achieve this goal, the following tasks were set:</p>
      <p>Develop an optimized model of YOLOv8 for real-time UAV detection on Raspberry Pi 5
while ensuring minimal computational overhead.</p>
      <p>Conduct a comparative analysis of different YOLOv8 versions, including YOLOv8, YOLOv8n,
YOLOv8 with Selective Knowledge Distillation (SKD), and YOLOv8 accelerated with
HailoAI.</p>
      <p>Implement and evaluate various optimization techniques such as model quantization (FP16/INT8), pruning, and hardware acceleration to enhance detection performance.</p>
      <p>Measure key performance metrics, including mean average precision (mAP), precision, recall, F1-score, frames per second (FPS), and inference latency, to determine the most effective deployment strategy.</p>
      <p>Investigate the impact of hardware accelerators, particularly HailoAI, on improving real-time UAV detection while maintaining energy efficiency on the Raspberry Pi 5.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Definition of experimental approaches to learning and optimization.</title>
    </sec>
    <sec id="sec-5">
      <title>Selective Knowledge Distillation.</title>
      <p>The central goal of this investigation is to balance speed and accuracy in UAV detection on the Raspberry Pi 5. To achieve this, we conduct a series of experiments evaluating four YOLOv8-based configurations under different optimization and hardware scenarios. The chosen configurations include (1) the standard YOLOv8 model, (2) its lightweight counterpart YOLOv8n, (3) YOLOv8 augmented with Selective Knowledge Distillation (SKD), and (4) YOLOv8 accelerated via the HailoAI hardware and SDK. By studying these contrasting setups, we seek insights into the most efficient blend of algorithmic refinement and specialized hardware for real-time embedded inference.</p>
      <p>The experiments are founded on carefully structured metrics, equations, and protocols to quantify performance from both computational and practical standpoints. To achieve this, we measure detection quality using established object-detection indicators while examining system resource utilization, namely power consumption, memory footprint, and inference latency. First, we denote T (1) as the total inference time (in milliseconds) for processing a single image. We approximate this time by considering the total number of floating-point operations (FLOPs) performed (O) and the effective processing frequency (f) achievable by the Raspberry Pi 5 CPU, GPU, or HailoAI accelerator:</p>
      <p>When partial computations are offloaded to the HailoAI processor, f can grow substantially due to the device's parallelization capabilities, thereby decreasing T. However, we also acknowledge that additional overhead may arise from device communication and data transfer. Consequently, the full latency L (2) in a real deployment scenario includes the preprocessing time t_pre, the model inference time t_inf, and the post-processing stage t_post:</p>
      <p>T = O / f (1)</p>
      <p>L = t_pre + t_inf + t_post (2)</p>
      <p>We then record and analyze each of these components for a thorough assessment of any proposed optimization. Running continuous UAV detection on edge hardware requires keeping a close eye on total power consumption, so we denote P_sys (3) as the combined system power:</p>
      <p>P_sys = P_static + P_CPU + P_GPU + P_Hailo, (3)</p>
      <p>where P_static captures the baseline idle power draw of the Raspberry Pi 5 board, and P_CPU, P_GPU, and P_Hailo signify the incremental power contributions from the CPU, on-board GPU, and the HailoAI processor, respectively.</p>
      <p>We further define an energy usage metric E (in joules) over an interval of time Δt:</p>
      <p>E = ∫_0^Δt P_sys(t) dt (4)</p>
      <p>This integral captures transient power spikes during inference (4). An inference engine that boasts a high throughput but also triggers large transient power surges may increase E over time, which matters for field-deployed UAV systems operating under battery constraints.</p>
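      <p>As a hedged illustration of how E can be obtained in practice, the following Python sketch numerically integrates sampled power readings with the trapezoidal rule; the read_power_watts() helper is hypothetical and stands in for whatever external power meter (e.g., an INA219-style sensor or a USB power logger) is attached to the board.</p>
      <preformat>
import time

def read_power_watts():
    """Hypothetical helper: return the instantaneous system power draw
    in watts, e.g., by querying an external INA219-style power meter."""
    raise NotImplementedError

def measure_energy(duration_s=60.0, sample_period_s=0.1):
    """Approximate E = integral of P_sys(t) dt over `duration_s` seconds
    via the trapezoidal rule over evenly spaced power samples."""
    samples = []
    t_end = time.monotonic() + duration_s
    while time.monotonic() &lt; t_end:
        samples.append(read_power_watts())
        time.sleep(sample_period_s)
    # Trapezoidal integration: energy in joules (W x s)
    return sum(
        0.5 * (samples[i] + samples[i + 1]) * sample_period_s
        for i in range(len(samples) - 1)
    )
      </preformat>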
      <p>To evaluate how reliably each model identifies drones, we incorporate standard detection metrics. Mean Average Precision (mAP) (5) gauges overall detection quality by averaging the Average Precision (AP) across each class i (where i ∈ {1, …, N}):</p>
      <p>mAP = (1/N) Σ_{i=1}^{N} AP_i (5)</p>
      <p>In UAV detection, there is a single drone class (N = 1). AP itself (6) is derived from a precision-recall curve:</p>
      <p>AP_i = ∫_0^1 max_{r' ∈ [r, 1]} p(r') dr (6)</p>
      <p>where p(r) is the measured precision at a given recall level r.</p>
      <p>We additionally record precision (Prec) and recall (Rec) (7):</p>
      <p>Prec = TP / (TP + FP), Rec = TP / (TP + FN), (7)</p>
      <p>where TP, FP, and FN are true positives, false positives, and false negatives, respectively; the F1-score, which balances precision and recall in a single metric (8):</p>
      <p>F1 = 2 · Prec · Rec / (Prec + Rec); (8)</p>
      <p>and frames per second (FPS), a real-time performance measure (9):</p>
      <p>FPS = 1 / T_sec, (9)</p>
      <p>where T_sec = T / 1000 is the inference time in seconds.</p>
      <p>Together, these metrics provide a balanced view, enabling us to judge how each optimization procedure, be it structural (e.g., pruning, SKD) or hardware-based (e.g., HailoAI offload), influences both predictive accuracy and computational load.</p>
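      <p>A minimal sketch of equations (7)-(9) in Python, assuming the TP, FP, and FN counts have already been obtained by matching predictions to ground truth at a fixed IoU threshold:</p>
      <preformat>
def detection_metrics(tp: int, fp: int, fn: int, t_ms: float):
    """Compute Prec and Rec (7), F1 (8) from raw counts, and FPS (9)
    from the per-frame inference time T given in milliseconds."""
    prec = tp / (tp + fp) if (tp + fp) else 0.0
    rec = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * prec * rec / (prec + rec) if (prec + rec) else 0.0
    fps = 1.0 / (t_ms / 1000.0)  # T_sec = T / 1000
    return prec, rec, f1, fps

# Example: 92 true positives, 6 false positives, 11 misses, 135 ms/frame
print(detection_metrics(92, 6, 11, 135.0))
      </preformat>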
      <p>Experiments started with the unmodified YOLOv8 model serving as a baseline. This architecture offers strong detection accuracy out of the box but can be heavy in terms of floating-point operations. We first collect baseline L, FPS, and mAP measurements on the Raspberry Pi 5 without any specialized compression or hardware acceleration. This measurement acts as a reference against which subsequent variations are compared.</p>
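      <p>A hedged sketch of such a baseline measurement using the Ultralytics Python API is shown below; "uav_yolov8.pt" is a hypothetical fine-tuned checkpoint, and any YOLOv8 weights file can be substituted. Note that predict() timings include pre- and post-processing, matching the latency decomposition in (2).</p>
      <preformat>
import time
from ultralytics import YOLO  # pip install ultralytics

model = YOLO("uav_yolov8.pt")  # hypothetical fine-tuned UAV weights

def benchmark(frames, warmup=5):
    """Return the mean per-frame latency L (ms) and FPS over a list of
    frames (file paths or numpy arrays)."""
    for f in frames[:warmup]:                      # let caches settle
        model.predict(f, imgsz=640, verbose=False)
    t0 = time.perf_counter()
    for f in frames:
        model.predict(f, imgsz=640, verbose=False)
    t_ms = (time.perf_counter() - t0) * 1000.0 / len(frames)
    return t_ms, 1000.0 / t_ms
      </preformat>
      <p>Recent Ultralytics versions also expose a per-stage timing breakdown on the results object, which can be used to separate t_pre, t_inf, and t_post as in (2).</p>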
      <p>The second setup involves the YOLOv8n model, deliberately designed with fewer layers and smaller convolutional channels. Its structure slashes the FLOP count (10), so faster inference is expected:</p>
      <p>O_n = γ · O_base, 0 &lt; γ &lt; 1, (10)</p>
      <p>where O_base indicates the FLOPs in the baseline YOLOv8 and γ is a reduction factor reflecting the degree of compression. A crucial question is whether YOLOv8n still manages to detect fast, small UAVs effectively or whether a more rigorous approach (like distillation) is needed to maintain accuracy. Selective Knowledge Distillation (SKD) represents a guided transfer of information from a teacher network to a smaller student model. Unlike conventional knowledge distillation (which attempts to mimic all teacher outputs equally), SKD pinpoints salient intermediate representations crucial for UAV detection, such as feature maps capturing motion cues or high-level object boundaries. The distillation-driven loss term L_KD (11) is integrated into the training objective:</p>
      <p>L_SKD = α · L_CE(y, ŷ) + β · L_KD(p_t, p_s), (11)</p>
      <p>where L_CE is the cross-entropy with the ground truth y, and L_KD measures how closely the student's predicted distributions p_s or selected feature maps match those of the teacher p_t. The hyperparameters α and β modulate the relative weight of standard training vs. distillation. By zeroing in on critical UAV-related features, SKD can reduce the total parameter set and FLOPs while preserving mAP. As a result, we documented not only the final detection accuracy but also measured the resulting FPS and L to confirm the net gain in efficiency.</p>
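      <p>The following PyTorch sketch illustrates the combined objective in (11). The selection of salient teacher feature maps is application-specific, so the distillation term here is shown over class logits only, with a temperature-scaled KL divergence standing in for L_KD; the α, β, and τ values are illustrative, not the tuned hyperparameters of this study.</p>
      <preformat>
import torch.nn.functional as F

def skd_loss(student_logits, teacher_logits, targets,
             alpha=0.7, beta=0.3, tau=2.0):
    """L_SKD = alpha * L_CE(y, y_hat) + beta * L_KD(p_t, p_s)  (eq. 11).
    L_KD is a temperature-scaled KL divergence between the teacher and
    student class distributions; tau softens both distributions."""
    l_ce = F.cross_entropy(student_logits, targets)
    log_p_s = F.log_softmax(student_logits / tau, dim=-1)
    p_t = F.softmax(teacher_logits / tau, dim=-1)
    # tau**2 restores gradient magnitude, as in Hinton et al. [10]
    l_kd = F.kl_div(log_p_s, p_t, reduction="batchmean") * tau ** 2
    return alpha * l_ce + beta * l_kd
      </preformat>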
      <p>The final experiment deploys YOLOv8, optionally pruned or distilled, on the Raspberry Pi 5 in tandem with the HailoAI processor. This specialized hardware, supported by Hailo's SDK, operates as an edge-based neural inference engine. It can parallelize convolutional layers, drastically boosting the effective processing frequency f and thus diminishing the per-frame inference time T = O / f. However, harnessing the HailoAI device requires careful integration:</p>
      <p>1. Model Compilation: we used the Hailo SDK to parse and compile the YOLOv8 architecture, mapping specific network layers to the HailoAI's internal dataflow.</p>
      <p>2. Data Transfer Optimization: input frames and intermediate tensors must be efficiently passed between the Raspberry Pi 5 memory and the HailoAI module. High-speed interfaces (e.g., M.2 or specialized board connectors) mitigate overhead, but the integration pipeline could introduce additional latencies (t_transfer).</p>
      <p>3. Hardware Parallelization: the HailoAI chip exploits parallel compute engines that handle different segments of the convolution and activation layers concurrently. Let us denote O_Hailo as the subset of operations mapped to HailoAI; the CPU may still handle certain lightweight tasks (12):</p>
      <p>O_total = O_Hailo + O_CPU, (12)</p>
      <p>and the effective total inference time (13) might be approximated by:</p>
      <p>T_accel ≈ max(O_Hailo / F_Hailo, O_CPU / F_CPU) + t_transfer, (13)</p>
      <p>where F_Hailo and F_CPU are the hardware-specific frequencies or throughput rates, and t_transfer measures any overhead for data exchange.</p>
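      <p>Equation (13) can be sanity-checked with a back-of-the-envelope calculation; the operation counts, throughputs, and transfer time below are illustrative placeholders rather than measured values.</p>
      <preformat>
def t_accel_ms(o_hailo, o_cpu, f_hailo, f_cpu, t_transfer_ms):
    """Per-frame time under eq. (13): the slower of the two parallel
    partitions dominates, plus host-accelerator transfer overhead."""
    return max(o_hailo / f_hailo, o_cpu / f_cpu) * 1000.0 + t_transfer_ms

# Illustrative: 8 GFLOPs offloaded at 2 TFLOP/s, 0.5 GFLOPs on a CPU
# sustaining 25 GFLOP/s, plus 4 ms of data movement -> about 24 ms/frame
print(t_accel_ms(8e9, 0.5e9, 2e12, 25e9, 4.0))
      </preformat>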
      <p>We expect this configuration to yield the highest throughput (FPS), but a complete evaluation of L and P_sys is necessary to confirm that these gains are not offset by data movement penalties or escalated power usage. Our plan is to benchmark each step within the pipeline - compilation, data transfer, inference, and post-processing - to better understand how HailoAI acceleration can be maximized for UAV detection tasks.</p>
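      <p>As one hedged example of per-stage benchmarking, HailoRT ships a command-line tool (hailortcli) whose run subcommand can time a compiled model directly on the accelerator. The sketch below invokes it from Python; the HEF file name is a placeholder, and the exact flags and output format may differ between HailoRT releases.</p>
      <preformat>
import subprocess

# `hailortcli run` executes a compiled HEF on the attached Hailo device
# and reports its measured throughput (FPS).
result = subprocess.run(
    ["hailortcli", "run", "yolov8_uav.hef"],  # placeholder HEF path
    capture_output=True, text=True, check=True,
)
print(result.stdout)
      </preformat>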
      <p>We implemented these four experimental conditions on an actual Raspberry Pi 5 board outfitted with the Camera Module 3 and the HailoAI expansion module. The presence of the Camera Module 3 allowed for a consistent and relatively high-quality video feed suitable for capturing diverse UAV movements, while the HailoAI expansion module provided a dedicated hardware accelerator that could be selectively engaged or bypassed. During each trial, system metrics such as CPU usage, GPU load, and memory footprint were tracked to identify possible bottlenecks and evaluate the relative benefits of offloading computationally intensive tasks.</p>
      <p>For each approach, the dataset encompassed UAV imagery under variable altitudes, lighting
environments, and occlusion levels, aiming to mimic realistic drone operations. To gather
representative examples of aerial scenarios, footage was captured at different times of day, including
dawn and dusk. This ensured that the training and validation sets reflected conditions ranging from
clear skies to moderate cloud cover, as well as the occasional presence of background clutter like
buildings, trees, or other airborne objects. By incorporating such variability, the experiments moved
closer to what might be encountered in real-world deployments at industrial sites, public events, or
rural surveillance zones.</p>
      <p>We then trained all models for 50 epochs, at a batch size of 16, with an initial learning rate of 0.001, while adopting standard data augmentations (random rotations, scaling, contrast jitter) to promote robust generalization. The batch size and learning rate were initially chosen based on prior benchmarks of resource-constrained platforms, striking a balance between convergence stability and hardware limitations. Random rotations and scaling ensured the models learned to recognize UAVs from multiple angles and distances, whereas contrast jitter compensated for variations in ambient lighting. This combination of training parameters and augmentations proved especially important in preventing overfitting, allowing the networks to remain adaptive to unexpected changes in altitude and background complexity.</p>
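      <p>A hedged sketch of this training run with the Ultralytics API follows; "uav_dataset.yaml" is a placeholder dataset definition, and the augmentation magnitudes are illustrative values approximating the stated random rotations, scaling, and contrast jitter.</p>
      <preformat>
from ultralytics import YOLO

model = YOLO("yolov8n.pt")  # or the baseline YOLOv8 weights
model.train(
    data="uav_dataset.yaml",  # hypothetical single-class drone dataset
    epochs=50,
    batch=16,
    lr0=0.001,       # initial learning rate
    imgsz=640,
    degrees=15.0,    # random rotation range (illustrative)
    scale=0.5,       # random scale jitter (illustrative)
    hsv_v=0.4,       # brightness/contrast-style jitter (illustrative)
)
      </preformat>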
      <p>Following each training session, we ran validation to capture the standard detection metrics. In parallel, real-time inference trials were conducted on live camera feeds, measuring FPS and maintaining logs of system power consumption. Summarizing these recordings provides a detailed picture of where each approach excels. Some configurations may demonstrate minimal power draw but reduced detection fidelity, whereas others might boast superior detection results but carry higher energy and latency costs.</p>
      <p>In addition, memory usage (RAM) was tracked during inference because edge devices
typically have limited headroom (14).</p>
      <p>M_total = M_static + M_model + M_buffers + M_temp, (14)</p>
      <p>where M_static is the baseline memory footprint of the operating system, M_model is the loaded weights, M_buffers accounts for input/output tensors, and M_temp refers to intermediate data structures. Quantization (FP16 or INT8) can lower M_model, which in turn helps accommodate more simultaneous tasks on the Raspberry Pi 5.</p>
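      <p>A minimal sketch of tracking M_total at runtime with psutil is given below; separating M_model, M_buffers, and M_temp precisely would require allocator-level instrumentation, so this sketch only reports the process total and its growth over a recorded baseline (M_static).</p>
      <preformat>
import psutil

def rss_mb():
    """Resident set size of the current process, in MiB."""
    return psutil.Process().memory_info().rss / 2**20

m_static = rss_mb()          # baseline before loading any weights
# ... load the model and run inference here ...
print(f"M_total ~= {rss_mb():.1f} MiB "
      f"(+{rss_mb() - m_static:.1f} MiB over baseline)")
      </preformat>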
      <p>By investigating these four YOLOv8-based solutions - baseline YOLOv8, YOLOv8n, YOLOv8 + SKD, and YOLOv8 + HailoAI - we aim to identify the ideal compromise among accuracy, speed, and energy efficiency. The interplay of advanced techniques like Selective Knowledge Distillation and specialized hardware acceleration represents a promising route for drone detection at the edge, where computational resources and power budgets are limited. Upon concluding these trials, we compared numerical outcomes to discern the most balanced configuration, offering a valuable reference for practitioners seeking to deploy real-time UAV surveillance on embedded platforms.</p>
    </sec>
    <sec id="sec-6">
      <title>5. Experiments and Results.</title>
      <p>This section presents an updated set of empirical findings for the four YOLOv8-based configurations, recalculated using revised, realistic parameters. The experiments were performed on a Raspberry Pi 5 connected to a Camera Module 3, capturing continuous aerial footage of small UAVs in varied environmental conditions. Each model was evaluated over multiple runs to ensure representative averages and minimize the influence of transient measurement spikes. Below, we summarize the key metrics, spanning accuracy, speed, and power consumption, and derive several insights into the real-world feasibility of each approach.</p>
      <p>All models were tested on images with a resolution of 640×640, following standard practice for UAV detection tasks. Table 1 lists the mean Average Precision (mAP) at IoU thresholds ranging from 0.5 to 0.95, precision, recall, F1-score, and the corresponding inference speed in frames per second (FPS) and per-frame latency. The data reflect stable performance once the Raspberry Pi 5 and, where applicable, the HailoAI module had fully initialized. The calculated metrics are shown in Table 1.</p>
      <p>The unmodified YOLOv8 model achieves a high mAP of 87.2% but runs at an average of 7.4 FPS,
which could be insufficient for high-speed drone movements. In practice, such a frame rate may lead
to noticeable delays when tracking small or distant drones that move erratically. Rapid maneuvers,
such as sudden altitude changes, demand higher throughput to avoid missing critical frames that
might contain decisive motion cues.</p>
      <p>YOLOv8n’s reduced architecture reaches 82.6% mAP, losing some accuracy but more than
doubling the frame rate (13.2 FPS) relative to the baseline. This compromise is especially important
for scenarios where continuous aerial surveillance matters more than perfectly identifying every
single drone. Field tests on battery-powered UAVs suggest that a faster refresh rate can improve
situational awareness during unpredictable flight paths, even if marginal accuracy is sacrificed.</p>
      <p>With Selective Knowledge Distillation (SKD), YOLOv8 maintains a strong 85.3% mAP and sees a
moderate boost in inference speed (about 10.1 FPS). This suggests SKD successfully trims
unnecessary parameters while preserving most detection capabilities. In dense airspace conditions—
where multiple drones or other flying objects appear—such a balanced approach can prove valuable
by avoiding the steep trade-offs of an overly pruned model. The HailoAI-assisted setup offers the
fastest inference, hitting 35 FPS, and retains a robust mAP of 86.9%. Its low latency (around 29 ms
per frame) could be vital for near-real-time UAV detection scenarios. Some operators prioritize this
ultra-fast inference to manage large fleets, deter unauthorized drone incursions, or quickly re-target
cameras in dynamic surveillance tasks.</p>
      <p>Beyond speed and accuracy, sustained drone surveillance demands attention to power usage. Table 2 shows the average system power draw (P_sys) under each YOLOv8 variant. We also provide the approximate idle power of the Raspberry Pi 5 for reference. These measurements were recorded over 15-minute intervals to account for warm-up phases and any transient load fluctuations. Consequently, model configuration should be guided not only by detection proficiency but also by practical constraints such as battery life and ambient temperature conditions.</p>
      <p>[Table 2 columns: Idle (W), Avg. Power (W)]</p>
    </sec>
    <sec id="sec-7">
      <title>Conclusions</title>
      <p>The findings of this study underline the delicate balance between performance, resource constraints, and energy usage when deploying UAV detection on edge hardware such as the Raspberry Pi 5. Beginning with an in-depth analysis of four YOLOv8-based configurations, ranging from a standard setup to a lightweight variant, a selective knowledge-distillation approach, and finally a hardware-accelerated scheme using HailoAI, we observed distinct trade-offs that can guide practitioners in making well-informed choices for various application scenarios.</p>
      <p>From an algorithmic standpoint, the experiments confirmed the enduring relevance of model
compression, mainly when operational considerations—such as limited memory and power budgets
—take center stage. YOLOv8n, for instance, demonstrated the efficiency gains that arise from
reducing model complexity, often achieving more than twice the frame rate of the baseline model.
However, the accompanying dip in mean Average Precision (mAP) might not be desirable in settings
where mission-critical tasks demand high detection fidelity. Meanwhile, the integration of Selective
Knowledge Distillation (SKD) offered a middle ground: the distilled model retained much of the
teacher network’s predictive capacity while decreasing inference latency and trimming redundant
computations. This approach can be pivotal in environments where additional power constraints
limit the feasibility of external hardware accelerators.</p>
      <p>On the hardware side, our examination of the HailoAI module underscored the promise of specialized accelerators to mitigate inference bottlenecks. By parallelizing computationally heavy layers, HailoAI helped YOLOv8 achieve frame rates that far surpassed those obtained via the Raspberry Pi 5 CPU and GPU alone, albeit at a modest rise in overall power usage. For organizations engaged in persistent aerial surveillance, such as airport perimeter security, wildlife conservation, or emergency response, such hardware acceleration can unlock new real-time capabilities, enabling the system to track multiple UAVs simultaneously or detect suspicious aerial objects closer to live video speeds. Significant drops in frame rate might overshadow marginal gains in mAP if the use case involves tracking fast-moving drones. Conversely, an extremely lightweight model could undermine detection reliability when the environment features occlusions, variable lighting, or tiny targets, all common in drone applications. The measured differences in average power draw, ranging from around six to nine watts in our tests, can also become pivotal in battery-powered setups that must remain operational for extended periods in remote areas.</p>
      <p>Future research might explore more adaptive or dynamic approaches that continually switch
between models or acceleration strategies based on context. For instance, a system could run a
compact model to achieve high frame rates most of the time and then selectively invoke a more
powerful accelerator for frames where aerial anomalies are detected. Such an approach would merge
energy efficiency with heightened responsiveness in critical moments.</p>
    </sec>
    <sec id="sec-8">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this work, the authors utilised ChatGPT to identify and rectify
grammatical, typographical, and spelling errors. Following the use of these tools, the authors
conducted a thorough review and made necessary revisions, and accept full responsibility for the
final content of this publication.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>Freedman</surname>
            ,
            <given-names>D.H.</given-names>
          </string-name>
          ,
          <article-title>“Are Drones the Future of Warfare?</article-title>
          ,” Scientific American,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>M. R.</given-names>
            <surname>Endsley</surname>
          </string-name>
          , “From Defensive to Offensive: Integrating Drone Technology in Security Missions,
          <source>” IEEE Security &amp; Privacy</source>
          , vol.
          <volume>19</volume>
          , no.
          <issue>3</issue>
          , pp.
          <fpage>45</fpage>
          -
          <lpage>52</lpage>
          ,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>T.</given-names>
            <surname>Neven</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. L.</given-names>
            <surname>Chan</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Anselmo</surname>
          </string-name>
          , “
          <article-title>Privacy Implications of Civilian Drone Use: Best Practices and Regulatory Guidelines,”</article-title>
          <source>Information &amp; Communications Technology Law</source>
          , vol.
          <volume>29</volume>
          , no.
          <issue>1</issue>
          , pp.
          <fpage>24</fpage>
          -
          <lpage>37</lpage>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>B. V.</given-names>
            <surname>Moeslund</surname>
          </string-name>
          and G. Thomas, “Optical Drone Detection Methods: A Survey,”
          <source>International Journal of Remote Sensing</source>
          , vol.
          <volume>41</volume>
          , no.
          <issue>10</issue>
          , pp.
          <fpage>3896</fpage>
          -
          <lpage>3914</lpage>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Y.</given-names>
            <surname>LeCun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Bengio</surname>
          </string-name>
          , and G. Hinton, “Deep Learning,
          <source>” Nature</source>
          , vol.
          <volume>521</volume>
          , pp.
          <fpage>436</fpage>
          -
          <lpage>444</lpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>K. I.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. W.</given-names>
            <surname>Yeon</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Kang</surname>
          </string-name>
          , “
          <string-name>
            <surname>Edge</surname>
            <given-names>AI</given-names>
          </string-name>
          :
          <article-title>Deploying Deep Neural Networks on LowPower Embedded Devices</article-title>
          ,
          <source>” IEEE Internet of Things Magazine</source>
          , vol.
          <volume>3</volume>
          , no.
          <issue>2</issue>
          , pp.
          <fpage>14</fpage>
          -
          <lpage>19</lpage>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>A.</given-names>
            <surname>Krizhevsky</surname>
          </string-name>
          ,
          <string-name>
            <surname>I. Sutskever</surname>
          </string-name>
          , and
          <string-name>
            <given-names>G. E.</given-names>
            <surname>Hinton</surname>
          </string-name>
          , “
          <article-title>ImageNet Classification with Deep Convolutional Neural Networks,” Communications of the ACM</article-title>
          , vol.
          <volume>60</volume>
          , no.
          <issue>6</issue>
          , pp.
          <fpage>84</fpage>
          -
          <lpage>90</lpage>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>[8] F. F. J. Sadeghi and S. M. Asghari, “Embedded Vision Systems for UAV Detection: Toward Efficient Deployments,” Sensors, vol. 21, no. 4, p. 1362, 2021.</mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>[9] D. Merdes, H. Lin, and J. Zhang, “Secure Airspace: AI-Based Drone Monitoring for Airports and Sensitive Infrastructure,” IEEE Transactions on Intelligent Transportation Systems, vol. 22, no. 7, pp. 4583–4592, 2021.</mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>[10] G. Hinton, O. Vinyals, and J. Dean, “Distilling the Knowledge in a Neural Network,” arXiv preprint arXiv:1503.02531, 2015.</mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>[11] R. Girshick, “Fast R-CNN,” Proceedings of the IEEE International Conference on Computer Vision, 2015, pp. 1440–1448.</mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>[12] W. Liu et al., “SSD: Single Shot MultiBox Detector,” Lecture Notes in Computer Science, vol. 9905, 2016, pp. 21–37.</mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>[13] Y. Lin, X. Chen, and J. Luo, “Embedded Deep Learning for Drone Surveillance on Low-Power Platforms,” ACM Transactions on Cyber-Physical Systems, vol. 4, no. 3, 2020.</mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>[14] V. B. István and J. Ulrich, “Comparative Analysis of Drone Detection Techniques in High-Risk Environments,” Electronics, vol. 9, no. 3, p. 472, 2020.</mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>[15] K. He, G. Gkioxari, P. Dollár, and R. Girshick, “Mask R-CNN,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 42, no. 2, pp. 386–397, 2020.</mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>[16] K. Simonyan and A. Zisserman, “Very Deep Convolutional Networks for Large-Scale Image Recognition,” International Conference on Learning Representations (ICLR), 2015.</mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>[17] A. D. Marquez and M. Kumar, “A Study of Model Compression Techniques for Real-Time Drone Detection,” Journal of Signal Processing Systems, vol. 93, no. 3–4, pp. 315–328, 2021.</mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>[18] J. Redmon and A. Farhadi, “YOLO9000: Better, Faster, Stronger,” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 6517–6525.</mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>[19] W. Lee, T. Kim, and G. Park, “Enhancements to YOLO for Resource-Constrained Drone Detection,” Sensors, vol. 21, no. 2, p. 489, 2021.</mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>[20] X. Wang, L. Lu, and D. Chen, “Refined YOLO Architectures for High-Speed UAV Recognition,” Computer Vision and Image Understanding, vol. 207, pp. 120–130, 2021.</mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>[21] C. Zhao, E. Li, M. Zhou, and B. Zheng, “Pruning Techniques for Embedded Object Detection Models,” Neurocomputing, vol. 421, pp. 263–274, 2021.</mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>[22] H. Wang, P. Zhang, and X. Liu, “Evaluation of Quantized YOLO Models on Raspberry Pi,” Electronics, vol. 10, no. 15, p. 1794, 2021.</mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>[23] Hailo Technologies, “Hailo-8™: High-Performance, Low-Power AI Processor for Edge Devices,” Hailo White Paper, 2021.</mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>[24] A. Sharma, B. Singh, and C. K. Mohan, “Robust Drone Detection Using Domain-Adaptive CNNs,” Pattern Recognition Letters, vol. 143, pp. 27–34, 2021.</mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>[25] M. Ghazal, O. Ouhsain, and A. Amira, “Comparative Assessment of Edge AI Frameworks for UAV Detection,” IEEE Access, vol. 9, pp. 105842–105852, 2021.</mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>[26] Y. Shin, J. Park, and H. Choi, “Adaptive Network Architectures for Real-Time Drone Tracking on Embedded Platforms,” IEEE Transactions on Vehicular Technology, vol. 70, no. 8, pp. 8315–8324, 2021.</mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>[27] S. Y. Zhu, Y. Chen, and R. Zhang, “Sensor Fusion of Visual and Acoustic Data for UAV Identification,” Sensors, vol. 22, no. 1, p. 114, 2022.</mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>[28] N. Boyko, L. Mochurad, and Y. Kryvenchuk, “Modeling of the Information System for Processing of a Large Distilled Data for the Investigation of Competitiveness of Enterprises,” Proceedings of the 4th International Conference on Computational Linguistics and Intelligent Systems, pp. 964–978, 2020.</mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>[29] B. T. Smith, A. M. Johnson, and M. L. Burch, “Thermal-Enhanced Drone Detection in Complex Environments,” Electronics, vol. 10, no. 4, p. 409, 2021.</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>