<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Elephant Detection near Railway Tracks using an Ensemble Approach of SSD and YOLO Model</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Dechen Doma Bhutia</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Swarup Das</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Rakesh Kumar Mandal</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science and Technology, University of North Bengal</institution>
          ,
          <addr-line>Siliguri, West Bengal</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Interest in elephant detection has grown in recent years. Elephant movement is frequently observed near railway tracks, and elephants are occasionally seen struggling to save their lives close to railroads. To minimize elephant casualties on railway tracks, an automated alarm system can be designed based on IoT and AI. This research presents a method for detecting elephants near railway tracks and raising an alarm to drive them away. A Raspberry Pi running an ensemble model of SSD and YOLO has been used. The ensemble approach demonstrates high precision, recall, and mAP, along with real-time processing capability. These metrics validate the system's effectiveness in reducing elephant casualties near railway tracks and highlight its potential for deployment in real-world scenarios. The system was trained on a Kaggle dataset.</p>
      </abstract>
      <kwd-group>
        <kwd>SSD</kwd>
        <kwd>YOLO</kwd>
        <kwd>Ensemble Model</kwd>
        <kwd>Raspberry Pi</kwd>
        <kwd>Elephant Detection</kwd>
        <kwd>Kaggle</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <sec id="sec-1-1">
        <title>Background</title>
        <p>Elephants are big, sluggish creatures that usually scour enormous areas for food, water, and migration
routes. Railway lines and many elephant habitats overlap, especially in nations like India, Sri Lanka,
and portions of Southeast Asia. Collisions between elephants and trains in these regions have become a major
problem, resulting in both passenger and elephant deaths. The expansion of roads, railroads, and
human populations causes elephant habitats to become more fragmented. Some areas have railroad
tracks that pass right through elephant routes and woodlands. Elephant movements can be reasonably
predictable in some places, and they are known to follow regular routes. However, their behavior can
be unpredictable when they feel threatened or startled, making it challenging to prevent collisions.
Traditional methods of detecting elephant movements, such as manual surveillance or infrared cameras,
have limitations: they are not always real-time or responsive enough to prevent accidents.</p>
        <p>
          In order to provide a more efficient and responsive system, a hybrid AI method blends several AI
approaches, such as machine learning, computer vision, sensor networks, and data fusion. Elephant
deaths in these high-risk areas can be considerably decreased by efficient real-time detection. Elephant
identification is one area of wildlife monitoring where machine learning models, especially those based
on computer vision, have become increasingly popular. Two of the most popular object detection
methods in the field of wildlife monitoring are You Only Look Once (YOLO) and Single Shot Multibox
Detector (SSD) [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ]. The existing literature on elephant detection near railway tracks shows that an ensemble
approach combining the strengths of SSD and YOLO offers promising improvements in detection accuracy,
efficiency, and robustness [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ].
        </p>
      </sec>
      <sec id="sec-1-2">
        <title>Literature Survey</title>
        <p>
          Geethanjali et al. (2024) present MobileNet-SSD V2, a novel automated wildlife detection
system that processes photos for real-time animal detection using a Convolutional Neural Network
(CNN). The study describes a thorough process, from dataset curation to model training and deployment,
that uses TensorFlow Lite for on-device inference [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]. W. Xue et al. (2017) designed and implemented a system
that leverages the ESP32-CAM platform in conjunction with the YOLOv8 object-detection model [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ]. Sibusiso et al. proposed a model that incorporates enhanced
StemBlock and Mobile Bottleneck Block modules to reduce the computing cost of model parameters and
floating-point operations (FLOPs) in the backbone. In addition, a BiFPN-based neck is used, with Focal-EIoU as the
loss function to measure the correctness of the predicted bounding boxes during inference
[
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]. Hussain, M. (2023) presented a paper that uses the most recent iteration, YOLOv8. The review
examines the main architectural innovations introduced at each iteration, followed by
industrial deployment examples for surface defect detection that support the technology's suitability for
industrial needs [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ]. Wei Liu et al. (2016) state that SSD discretizes the output space of bounding boxes
into a series of default boxes across various aspect ratios and scales according to the location of
the feature map. At prediction time, the network produces scores for the presence of each object category
in each default box and adjusts the box to better fit the shape of the object. To naturally
handle objects of different sizes, the network also integrates predictions from several feature maps with
varying resolutions [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]. Yu-Chen Chiu et al. (2020) presented a lightweight object detection model based on MobileNet-v2
that can be used in embedded devices with constrained processing
resources and achieved up to 75.9% mAP on the VOC dataset [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ]. P. F. Felzenszwalb et al. (2010)
presented an object detection method based on mixtures of multiscale deformable part models. The
approach delivers leading results in the PASCAL object detection challenges and can represent highly
varied object classes [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ]. According to A. Biglari et al. (2022), the main objective of the proposed
approach is to create a system that can identify unusual animals by automatically extracting visual
attributes from the training set. One of the system's essential parts is an image capture and preprocessing
module, which analyzes images in real time to lower noise and improve recognition accuracy. A
module for identifying target uncommon creatures inside photos is also included [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]. Bijuphukan
Bhagabati et al. (2024) presented a paper in which artificial intelligence (AI) techniques are used to identify
wild animals from live video footage, issue alerts to prevent interactions, and safeguard both people and
animals. Real-time wild animal recognition is achieved using YOLOv5 together with a SENet attention
layer and deep learning models [
          ]. Yuvaraj Munian et al. (2022) suggest a solution for
nighttime animal detection that combines a convolutional neural network (CNN) and the Histogram
of Oriented Gradients (HOG) technique. A range of CNNs, including a basic CNN and a VGG16-based
CNN, as well as machine learning algorithms, including Random Forest (RF), Support Vector Machine
(SVM), Linear Regression (LR), Decision Tree (DT), and Gaussian Naïve Bayes (GNB), are
used to benchmark the suggested intelligent system [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ]. Zeyu Xu et al. (2024) provide a literature
review of deep learning methods for animal detection in aerial and satellite images. The final
results show that Faster R-CNN, YOLO, ResNet, and U-Net are the most used neural network structures
[
          <xref ref-type="bibr" rid="ref13">13</xref>
          ]. Sugumar et al. (2014) suggest an unsupervised automated elephant image detection system
(EIDS) as a remedy for human-elephant conflict. Once an elephant's image is captured in the forest
border zones, it is transmitted over an RF network to a base station. The received image is decomposed
using the Haar wavelet to produce multilevel wavelet coefficients, from which picture features are
extracted so that the query image of the elephant can be compared with the images in the database
using image vision algorithms [14]. D. Yudin et al. (2019) deal with the challenge of detecting large
animals on the road. Their specialized data, used with various neural networks, with YOLOv3 achieving
an mAP of 0.78 at 35 fps for 10 animal classes, makes way for improved safety on the roads [15]. Gupta et
al. (2022) proposed several deep learning-based models to recognize elephants in pictures and videos.
For rhino detection, a number of models based on convolutional neural networks (CNNs) and three
models based on transfer learning (TL), ResNet50, MobileNet, and Inception V3, have been tested and
optimized [16]. N. Mamat et al. (2022) used the YOLOv5 approach to identify four types of animals
that are frequently found in farming regions. With a cross stage partial network (CSP) as its backbone,
YOLOv5 can produce detections with excellent accuracy [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]. Patel, D., and Sharma, S. (2022) suggested
that YOLOv3 is the optimum model for real-time elephant detection. In terms of classification
performance, YOLOv3 outperforms SSD_efficientdet_d0_512×512 [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ].
        </p>
      </sec>
      <sec id="sec-1-3">
        <title>Sections of this Paper</title>
        <p>The different sections of this paper are as follows: Section 1 presents the introduction, objective, and
literature review; Section 2 deals with data acquisition; Section 3 describes the methodology of the
proposed system; Section 4 presents a complete result analysis of the individual models as well as the
ensemble model; and Section 5 presents the conclusions drawn.</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>2. Data Acquisition</title>
      <p>Image data of Asian elephants has been acquired and stored in a directory for training. A repository
of 5,000 images has been obtained from https://www.kaggle.com/datasets/gunarakulangr/sri-lankan-wild-elephant-dataset.
The dataset contains different single and group images [17]. These images are preprocessed to form a
standard frame, and only images containing a single elephant are kept for training. No image annotation
is needed because there are more than 4,000 such images, which are enough for training and testing.
Figure 1 displays the filtered images.</p>
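As an illustration, the filtering and standardization step described above can be sketched as follows. This is a minimal sketch: the 300×300 frame size, the nearest-neighbour resizing, and the per-image elephant counts are assumptions for illustration, not details specified in the paper.

```python
import numpy as np

def standardize_frame(img: np.ndarray, size: int = 300) -> np.ndarray:
    """Nearest-neighbour resize of an H x W x C image array to a standard
    size x size frame, mimicking the 'standard frame' preprocessing step."""
    h, w = img.shape[:2]
    rows = np.minimum(np.arange(size) * h // size, h - 1)
    cols = np.minimum(np.arange(size) * w // size, w - 1)
    return img[rows][:, cols]

def filter_single_elephant(images, elephant_counts):
    """Keep only images that contain exactly one elephant, as in Section 2.
    elephant_counts is a hypothetical per-image count (e.g. from a detector
    or manual inspection)."""
    return [img for img, n in zip(images, elephant_counts) if n == 1]
```

A 640×480 input, for example, comes out as a 300×300 array ready for batching into the training pipeline.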
    </sec>
    <sec id="sec-3">
      <title>3. Methodology</title>
      <p>
        Models that can effectively and precisely identify big, moving animals in a range of environmental
circumstances are necessary for elephant detection. Acoustic sensors, infrared cameras, and manual
surveys are examples of traditional techniques. However, these approaches have a number of drawbacks,
including expensive manpower, delayed reaction times, and environmental issues like bad weather
or poor visibility. Consequently, computer vision-based automated solutions have grown in significance.
Modern object detection methods like SSD and YOLO have been effectively used in a number of
domains, including wildlife monitoring. Deep learning and Convolutional Neural Networks (CNNs) are
the foundations of these models, enabling them to recognize intricate patterns in images and learn
spatial hierarchies [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <sec id="sec-3-1">
        <title>Single Shot Multibox Detector (SSD):</title>
        <p>SSD is a fast, effective, real-time model for object detection. In contrast to conventional object
detection systems (such as Faster R-CNN) that rely on region proposals, SSD makes predictions for
numerous bounding boxes in a single network run, hence the term "single shot." Elephant detection
benefits from SSD's reputation for striking a balance between speed and precision, which makes it
appropriate for real-time applications like train monitoring systems.</p>
        <p>Rapid image processing and the ability to identify several objects in a scene are essential in dynamic
settings like railroad tracks. Although SSD is good at recognizing small objects quickly, it
may have trouble spotting big creatures like elephants in crowded or cluttered areas.</p>
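The default-box mechanism that underlies SSD can be illustrated with a short sketch. The feature-map size, scale, and aspect ratios below are illustrative values, not the configuration used in this work.

```python
import math

def default_boxes(fmap_size, scale, aspect_ratios=(1.0, 2.0, 0.5)):
    """Generate SSD-style default boxes as (cx, cy, w, h) tuples in
    normalized [0, 1] image coordinates, one box per aspect ratio per
    feature-map cell. The network then scores each default box per class
    and regresses offsets to fit the object."""
    boxes = []
    for i in range(fmap_size):        # rows of the feature map
        for j in range(fmap_size):    # columns of the feature map
            cx = (j + 0.5) / fmap_size
            cy = (i + 0.5) / fmap_size
            for ar in aspect_ratios:
                boxes.append((cx, cy, scale * math.sqrt(ar), scale / math.sqrt(ar)))
    return boxes
```

Calling this for several feature maps with increasing scales mirrors how SSD combines predictions at multiple resolutions to cover objects of different sizes.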
      </sec>
      <sec id="sec-3-2">
        <title>YOLO (You Only Look Once)</title>
        <p>YOLO (You Only Look Once) is a popular deep learning model for real-time object recognition. It is
designed to recognize objects in pictures and videos, quickly and accurately determining each object's
location and category. YOLO is effectively used for real-time applications, as it can process an entire
image in a single pass, unlike older object detection methods that required multiple passes through the
image. Its ability to simultaneously predict multiple bounding boxes and their corresponding class
probabilities using a single CNN makes the model highly efficient compared to region-based methods
(like R-CNN, Fast R-CNN, etc.).</p>
        <p>The input image is divided by YOLO into a grid. Each grid cell predicts bounding boxes (coordinates:
height, width, and centre), a confidence score (how likely it is that a bounding box contains an object),
and class probabilities (the likelihood of each class being present). YOLO is trained to predict class
labels and bounding boxes directly from raw images, without requiring separate components for feature
extraction, object localization, or classification. The latest version is YOLOv8 (2023), which focuses on
further optimizations and improved performance, providing pre-trained models for tasks such as
segmentation and detection.</p>
        <p>YOLO is applied in surveillance cameras to identify suspicious activities; in detecting vehicles,
pedestrians, and traffic signs; in medical image analysis, to detect anomalies such as tumors or fractures;
and in object tracking and inventory management. Its advantage is that it is extremely fast and can
process video streams in real time, making it ideal for live object detection. It is simple in design, using
a single neural network for all tasks, which simplifies deployment, and it often yields good results even
when detecting small or overlapping objects. However, it has some shortcomings: YOLO is less effective
at detecting small objects than models like Faster R-CNN, and it can sometimes produce less precise
bounding boxes, especially in crowded scenes or with overlapping objects, giving rise to localization
errors.</p>
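The grid-cell prediction just described can be sketched in a few lines. This is a minimal illustration of YOLO-style decoding, not the actual network code; the 7×7 grid and the two-class probability vector in the usage below are assumed values.

```python
def decode_cell(tx, ty, tw, th, objectness, class_probs, row, col, grid=7):
    """Decode one YOLO-style grid-cell prediction.

    (tx, ty) is the box centre offset within the cell, (tw, th) the box
    width/height relative to the whole image, and objectness is the score
    that the box contains any object. The final per-class confidence is
    objectness x class probability."""
    cx = (col + tx) / grid                        # absolute centre, in [0, 1]
    cy = (row + ty) / grid
    scores = [objectness * p for p in class_probs]
    best = max(range(len(scores)), key=scores.__getitem__)
    return (cx, cy, tw, th), best, scores[best]
```

For example, a cell at row 3, column 3 of a 7×7 grid with centre offsets (0.5, 0.5) decodes to a box centred at (0.5, 0.5) of the image, with confidence equal to the objectness multiplied by the winning class probability.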
      </sec>
      <sec id="sec-3-3">
        <title>An Ensemble Approach of YOLO and SSD</title>
        <p>An ensemble of SSD (Single Shot Multibox Detector) and YOLO (You Only Look Once) can combine
the strengths of both models, leading to improved object detection performance in a variety of
scenarios [16]. Each model has its own advantages and disadvantages, and combining both can help to
address their limitations. One advantage of the ensemble is improved accuracy: YOLO and SSD each
have strengths in different areas of object detection. YOLO is known for its speed and ability to detect
large objects, but it may struggle with small objects or complex scenes, whereas SSD can perform much
better at detecting small objects and in situations where objects are densely packed. Combining these
models can lead to a more balanced and accurate detector, where each model handles the parts of the
image it is most suited for. An ensemble can also aid generalization and lessen overfitting by combining
the outputs of both models, particularly on heterogeneous datasets with different object sizes and
backgrounds. Errors such as false positives and false negatives can be decreased with an ensemble
approach: even if one model makes a mistake, the other model may still produce the right answer.
While certain photos or settings may be difficult for a single model to handle, the ensemble can manage
these edge cases more successfully by merging the output of both YOLO and SSD. For instance, YOLO
may perform better than SSD at identifying huge objects in clear scenes, whereas SSD may be better at
identifying little objects. With its simplified design, YOLO is faster and can be used to accelerate
real-time applications.</p>
        <p>
          By employing YOLO as a preliminary pass to swiftly identify conspicuous objects, the ensemble can
concentrate SSD on more difficult regions that require more thorough detection. SSD can handle dense
scenes more skillfully and offer greater localization, albeit being a little slower. By concentrating
on high-accuracy localization, SSD can enhance the outcomes of YOLO predictions when used in
conjunction with YOLO [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]. In simpler settings, where the network can more precisely predict bounding
boxes, YOLO's detection typically works better with larger objects. YOLO performs effectively when
objects take up a large amount of the image because of its grid-based prediction, whereas SSD is generally
more sensitive to tiny objects since it makes predictions using several feature maps at various scales.
By combining the two models, one can take advantage of SSD's capacity to identify smaller objects and
YOLO's capacity to capture larger ones. Different backbone networks (such as Darknet, ResNet, and
VGG) can be used with both YOLO and SSD, potentially offering different trade-offs in feature
representation and extraction [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]. The system can benefit from the advantages of several architectures by utilizing an
ensemble of these models with various backbones. Both models can be run in parallel, processing
distinct areas of the image (or the same regions with different detection tasks), because YOLO is
incredibly quick and SSD is not much slower. One model may have trouble in complex situations
while the other can make up for it, resulting in only a small loss of detection time. The computational
burden can be balanced by dividing the detection duty between YOLO and SSD, so that one model
is not overloaded with challenging cases while the other functions more rapidly. Relying on both a
quick model (YOLO) and a more accurate one (SSD) provides flexibility in applications where speed
is crucial, such as robotics, autonomous cars, or security monitoring. The ensemble approach can be
modified to give priority to accuracy in some situations and speed in others. In real time, the system may
dynamically pick between YOLO and SSD depending on the specific context (e.g. real-time detection in
dynamic surroundings), or both might run in parallel, with their findings combined for more thorough
predictions. By combining the output from both models, an ensemble allows a more reliable confidence
score to be calculated. For instance, a prediction can be accepted with confidence if both models agree
on the object's detection and classification; if the models disagree, the confidence score can be adjusted
or the case flagged for additional examination.
        </p>
      </sec>
      <sec id="sec-3-4">
        <title>Algorithm for Ensemble of SSD and YOLO</title>
        <p>Step 1: Load the pre-trained weights of the SSD and YOLO models. This stage assumes the use of a
framework, such as PyTorch or TensorFlow, in which loading these models is simple.
Step 2: Preprocess the input image so that it is in the format both models require. Both
the SSD and YOLO models typically assume the input image has been normalized and resized.
Step 3: Run both models on the preprocessed image to obtain predictions. Both models will produce
bounding boxes, class labels, and confidence scores.</p>
        <p>Step 4: Use Non-Maximum Suppression (NMS) to eliminate redundant bounding boxes for both
SSD and YOLO. Lower-confidence boxes that significantly overlap with higher-confidence ones will
be eliminated.</p>
        <p>Step 5: Combine the bounding boxes, class labels, and confidence scores from the SSD and YOLO
results. Concatenate or merge the bounding boxes from the two models to combine the detections.
To give one model's predictions more weight than the other, use weighted scores or a secondary NMS.
Step 6: Apply a final NMS on the combined bounding boxes from both models to eliminate duplicate
detections, i.e. boxes with overlapping predictions.
Step 7: Following the final NMS, return or display the final bounding boxes, class labels, and confidence
scores.</p>
        <p>Step 8: If necessary, carry out any further post-processing, such as modifying the bounding box
coordinates or applying a confidence threshold to eliminate predictions with low confidence.</p>
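The steps above can be sketched as follows. Model loading and inference (Steps 1–3) are stubbed out; only the merging logic of Steps 4–6 is shown, with detections represented as (box, label, score) tuples and boxes as (x1, y1, x2, y2). The equal default weights are an assumption; the paper does not specify the weighting.

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union else 0.0

def nms(dets, thresh=0.5):
    """Steps 4 and 6: greedy Non-Maximum Suppression over (box, label, score)
    detections; lower-confidence boxes overlapping a kept same-class box
    above the IoU threshold are discarded."""
    keep = []
    for d in sorted(dets, key=lambda d: d[2], reverse=True):
        if all(d[1] != k[1] or iou(d[0], k[0]) < thresh for k in keep):
            keep.append(d)
    return keep

def ensemble_detections(yolo_dets, ssd_dets, w_yolo=1.0, w_ssd=1.0, thresh=0.5):
    """Step 5: weight and merge the per-model detections, then Step 6:
    run a final NMS over the combined set."""
    merged = [(b, l, s * w_yolo) for b, l, s in nms(yolo_dets, thresh)]
    merged += [(b, l, s * w_ssd) for b, l, s in nms(ssd_dets, thresh)]
    return nms(merged, thresh)
```

When both models report the same elephant, only the higher-confidence box survives the final NMS; a detection seen by only one model is kept, which is exactly how the ensemble reduces false negatives.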
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Result Analysis</title>
      <p>The performance metrics for the ensemble of SSD and YOLO models can be calculated from the
counts of True Positives (TP), False Positives (FP), False Negatives (FN), and True Negatives (TN):
Precision, Recall, F1 Score, Mean Average Precision (mAP), and Inference Time are reported for
each model (YOLO, SSD, and their ensemble).</p>
      <p>• True Positives (TP) = 700
• False Negatives (FN) = 100
• False Positives (FP) = 60
• True Negatives (TN) = 1140
• Total dataset = 5000</p>
      <sec id="sec-4-1">
        <title>Precision, Recall, and F1 Score Calculation</title>
        <p>Precision measures how many of the predicted positives are actually correct: Precision = TP / (TP + FP).</p>
        <p>Recall measures how many of the actual positives were correctly identified: Recall = TP / (TP + FN).</p>
        <p>The F1 score is the harmonic mean of Precision and Recall: F1 = 2 × (Precision × Recall) / (Precision + Recall).</p>
        <p>Mean Average Precision (mAP) is calculated based on class-wise performance and IoU thresholds.</p>
        <p>Inference time depends on the model architecture. SSD is slower than YOLO, as YOLO is a single-stage
detector, but the ensemble model has a somewhat longer inference time because it combines the
predictions of both models. The inference times of the two models and their ensemble were as follows:
YOLO: 50 ms per image; SSD: 60 ms per image; Ensemble (YOLO + SSD): 110 ms per image.</p>
        <p>Therefore, for the ensemble model:
Precision = 700 / (700 + 60) = 0.921
Recall = 700 / (700 + 100) = 0.875
F1 = 2 × (0.921 × 0.875) / (0.921 + 0.875) ≈ 0.896
Summary of results of the ensemble model:
• Precision = 0.921
• Recall = 0.875
• F1 Score = 0.896
• mAP = 0.890
• Inference Time = 110 ms per image</p>
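The calculations above can be reproduced directly from the reported counts (the third decimal of F1 depends on rounding of the intermediate values):

```python
def detection_metrics(tp, fp, fn):
    """Precision, recall, and F1 score from raw detection counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Counts reported for the ensemble model: TP = 700, FP = 60, FN = 100
p, r, f1 = detection_metrics(700, 60, 100)
```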
      </sec>
      <sec id="sec-4-2">
        <title>Results of the Individual YOLO and SSD Models</title>
        <p>For the YOLO model: Precision := 0.90, Recall := 0.80, F1 := 2 × (0.90 × 0.80) / (0.90 + 0.80) = 1.44 / 1.70 ≈ 0.847,
mAP := 0.75 (at an IoU threshold of 0.5), and Inference Time := 50 ms per image. For the SSD model:
Precision := 0.85 and Recall := 0.85.</p>
        <p>The ensemble of SSD and YOLO shows a higher F1 Score (0.896) than the individual models, indicating
better overall performance. Precision is highest in the ensemble model, although the ensemble's inference
time is longer than that of either SSD or YOLO individually, a trade-off for improved accuracy.</p>
      </sec>
      <sec id="sec-4-3">
        <title>4.1. Ablation Study for Evaluating the Efficacy of the Ensemble Model and Overfitting</title>
        <p>
          An ablation study was performed to isolate the contributions of the different components of the ensemble
model (YOLO and SSD) [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ] and to evaluate its overall performance, in order to understand the effectiveness of
the ensemble approach and identify potential overfitting.
        </p>
      </sec>
      <sec id="sec-4-5">
        <title>Experimental Setup</title>
        <p>Dataset: Training and evaluation are conducted on the Kaggle dataset for elephant detection,
divided into training (70%), validation (15%), and test (15%) subsets.</p>
        <p>Metrics: Precision, Recall, F1 Score, mAP, and Inference Time (as listed in Table 1).</p>
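A 70/15/15 split of the image list can be sketched as follows (a minimal illustration; the shuffle seed is an arbitrary choice for reproducibility and is not specified in the paper):

```python
import random

def split_dataset(items, train_frac=0.70, val_frac=0.15, seed=42):
    """Shuffle and split a dataset into training/validation/test subsets;
    whatever remains after the train and validation fractions (here 15%)
    forms the test set."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = int(len(items) * train_frac)
    n_val = int(len(items) * val_frac)
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])
```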
        <sec id="sec-4-5-1">
          <title>Baseline Models: YOLO-only, SSD-only, Ensemble (YOLO + SSD); Observations</title>
          <p>The ensemble reduces overfitting by leveraging the strengths of both YOLO and SSD, and
demonstrates better generalization across heterogeneous datasets. It has a slightly higher inference
time than YOLO-only but is still suitable for real-time applications; no significant overfitting was
observed, owing to the complementary strengths of the two models.</p>
          <p>The above study highlights the efficacy of the ensemble model in achieving higher accuracy, recall,
and mAP compared to the individual models. The ensemble approach effectively mitigates overfitting by
combining the complementary strengths of YOLO and SSD, making it a robust solution for elephant
detection near railway tracks.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>
        The approach in this research work is to develop an image-capturing system for wild animals,
especially elephants, installed near railway tracks, based on an ensemble of SSD and YOLO models
for the identification of elephants [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], raising an alarm when the output is positive. Having
been trained on the Kaggle dataset [17], the ensemble system achieved a precision of 92.1%, a recall of 87.5%,
an F1 score of 89.6%, an mAP of 89.0%, and an inference time of 110 ms per image, making the system
suitable for real-time applications. The ensemble approach reduces errors by combining the outputs of
SSD and YOLO [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. The ensemble approach is also capable of identifying elephants at varying distances
with high confidence, and the system is adaptable to diverse environmental conditions like low visibility,
cluttered backgrounds, and dense scenes, ensuring consistent performance.
      </p>
    </sec>
    <sec id="sec-6">
      <title>Declaration on Generative AI</title>
      <sec id="sec-6-1">
        <p>The author(s) have not employed any Generative AI tools.</p>
        <p>[14] S. J. R. Sugumar, An improved real time image detection system for elephant intrusion along
the forest border areas, The Scientific World Journal (2014). doi:10.1155/2014/393958.
[15] D. Yudin, A. Sotnikov, A. Krishtopik, Detection of big animals on images with road scenes using deep
learning, in: International Conference on Artificial Intelligence: Applications and Innovations
(ICAIAI), volume 3, Belgrade, Serbia, 2019, pp. 100–103. doi:10.1109/IC-AIAI48757.2019.00028.
[16] S. Gupta, N. Mohan, P. Nayak, et al., Deep vision-based surveillance system to prevent
train–elephant collisions (2022). doi:10.1007/s00500-021-06493-8.
[17] Wild elephant dataset, 2024. URL: https://www.kaggle.com/datasets/gunarakulangr/
sri-lankan-wild-elephant-dataset [Accessed May 15, 2024].</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>P. D.</surname>
            ,
            <given-names>S. S.</given-names>
          </string-name>
          ,
          <source>Automated detection of elephant using ai techniques</source>
          , Springer-Verlag
          <volume>404</volume>
          (
          <year>2022</year>
          ). doi:
          <volume>10</volume>
          .1007/
          <fpage>978</fpage>
          -981-19-6406-04.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>N.</given-names>
            <surname>Mamat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. F.</given-names>
            <surname>Othman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Yakub</surname>
          </string-name>
          ,
          <article-title>Animal intrusion detection in farming area using yolov5 approach</article-title>
          , in: 22nd International Conference on Control,
          <source>Automation and Systems</source>
          , Jeju, Korea,
          <year>2022</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>5</lpage>
          . doi:
          <volume>10</volume>
          .23919/ICCAS55662.
          <year>2022</year>
          .
          <volume>10003780</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>G. P</given-names>
            ,
            <surname>M. Nivin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Rajeshwari</surname>
          </string-name>
          ,
          <article-title>Advances in ecological surveillance: Real- time wildlife detection using mobilenet-ssd v2 convolutional neural network</article-title>
          ,
          <source>IJRASET Journal For Research in Applied Science and Engineering Technology</source>
          <volume>11</volume>
          (
          <year>2024</year>
          )
          <fpage>2333</fpage>
          -
          <lpage>2345</lpage>
          . doi:
          <volume>10</volume>
          .22214/ijraset.
          <year>2023</year>
          .
          <volume>57847</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name><given-names>W.</given-names> <surname>Xue</surname></string-name>,
          <string-name><given-names>T.</given-names> <surname>Jiang</surname></string-name>,
          <string-name><given-names>J.</given-names> <surname>Shi</surname></string-name>,
          <article-title>Animal intrusion detection based on convolutional neural network</article-title>,
          in: <source>17th International Symposium on Communications and Information Technologies (ISCIT)</source>,
          Cairns, QLD, Australia, <year>2017</year>, pp. <fpage>1</fpage>-<lpage>5</lpage>.
          doi: 10.1109/ISCIT.2017.8261234.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name><given-names>S. R.</given-names> <surname>Bakana</surname></string-name>,
          <string-name><given-names>Y.</given-names> <surname>Zhang</surname></string-name>,
          <string-name><given-names>B.</given-names> <surname>Twala</surname></string-name>,
          <article-title>WildARe-YOLO: A lightweight and efficient wild animal recognition model</article-title>,
          <source>Ecological Informatics</source> <volume>80</volume> (<year>2024</year>).
          doi: 10.1016/j.ecoinf.2024.102541.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name><given-names>M.</given-names> <surname>Hussain</surname></string-name>,
          <article-title>YOLO-v1 to YOLO-v8, the rise of YOLO and its complementary nature toward digital manufacturing and industrial defect detection</article-title>,
          <source>Machines</source> <volume>11</volume> (<year>2023</year>).
          doi: 10.3390/machines11070677.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name><given-names>W.</given-names> <surname>Liu</surname></string-name>,
          <string-name><given-names>D.</given-names> <surname>Anguelov</surname></string-name>,
          <string-name><given-names>D.</given-names> <surname>Erhan</surname></string-name>,
          <string-name><given-names>C.</given-names> <surname>Szegedy</surname></string-name>,
          <string-name><given-names>S.</given-names> <surname>Reed</surname></string-name>,
          <string-name><given-names>C.-Y.</given-names> <surname>Fu</surname></string-name>,
          <string-name><given-names>A. C.</given-names> <surname>Berg</surname></string-name>,
          <article-title>SSD: Single shot multibox detector</article-title>,
          in: <source>Computer Vision - ECCV 2016</source>, Springer International Publishing, <year>2016</year>.
          doi: 10.1007/978-3-319-46448-0_2.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name><given-names>Y.-C.</given-names> <surname>Chiu</surname></string-name>,
          <string-name><given-names>C.-Y.</given-names> <surname>Tsai</surname></string-name>,
          <string-name><given-names>M.-D.</given-names> <surname>Ruan</surname></string-name>,
          <string-name><given-names>G.</given-names> <surname>Shen</surname></string-name>,
          <string-name><given-names>T.-T.</given-names> <surname>Lee</surname></string-name>,
          <article-title>An improved object detection model for embedded systems</article-title>,
          in: <source>International Conference on System Science and Engineering (ICSSE)</source>,
          <year>2020</year>, pp. <fpage>1</fpage>-<lpage>5</lpage>.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name><given-names>P. F.</given-names> <surname>Felzenszwalb</surname></string-name>,
          <string-name><given-names>R. B.</given-names> <surname>Girshick</surname></string-name>,
          <string-name><given-names>D.</given-names> <surname>McAllester</surname></string-name>,
          <string-name><given-names>D.</given-names> <surname>Ramanan</surname></string-name>,
          <article-title>Object detection with discriminatively trained part-based models</article-title>,
          <source>IEEE Transactions on Pattern Analysis and Machine Intelligence</source>
          <volume>32</volume> (<year>2010</year>).
          doi: 10.1109/TPAMI.2009.167.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name><given-names>A.</given-names> <surname>Biglari</surname></string-name>,
          <string-name><given-names>W.</given-names> <surname>Tang</surname></string-name>,
          <article-title>A vision-based cattle recognition system using TensorFlow for livestock water intake monitoring</article-title>,
          <source>IEEE Sensors Letters</source> <volume>6</volume> (<year>2022</year>).
          doi: 10.1109/LSENS.2022.3215699.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>Bijuphukan Bhagabati</string-name>,
          <string-name>Kandarpa Kumar Sarma</string-name>,
          <string-name>K. C. B.</string-name>,
          <article-title>An automated approach for human-animal conflict minimisation in Assam and protection of wildlife around the Kaziranga National Park using YOLO and SENet attention framework</article-title>,
          <source>Ecological Informatics</source> <volume>79</volume> (<year>2024</year>).
          doi: 10.1016/j.ecoinf.2023.102398.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name><given-names>Y.</given-names> <surname>Munian</surname></string-name>,
          <string-name><given-names>A.</given-names> <surname>Martinez-Molina</surname></string-name>,
          <string-name><given-names>D.</given-names> <surname>Miserlis</surname></string-name>,
          <string-name><given-names>H.</given-names> <surname>Hernandez</surname></string-name>,
          <string-name><given-names>M.</given-names> <surname>Alamaniotis</surname></string-name>,
          <article-title>Intelligent system utilizing HOG and CNN for thermal image-based detection of wild animals in nocturnal periods for vehicle safety</article-title>,
          <source>Applied Artificial Intelligence</source> (<year>2022</year>).
          doi: 10.1080/08839514.2022.2031825.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. K.</given-names>
            <surname>Skidmore</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Lamprey</surname>
          </string-name>
          ,
          <article-title>A review of deep learning techniques for detecting</article-title>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>