<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>An Efficient Elephant Detection Strategy using Visual Attention Network (VAN) in Custom Dataset Improved YOLOv7 Model</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Rabin Kumar Mullick</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Rakesh Kumar Mandal</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of North Bengal</institution>
          ,
          <addr-line>Raja Rammohanpur, Darjeeling, West Bengal</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2024</year>
      </pub-date>
      <abstract>
        <p>Detecting elephants is crucial for managing human-elephant conflict, especially on roads, railways, and near human localities. Cloud-based elephant detection using YOLOv7 could mitigate this conflict and provide surveillance of elephant transgressions, and combining detection units could address larger areas. To this end, two key tasks are proposed. First: detecting elephants on railway tracks, highways, or nearby human localities using an improved YOLOv7 model, which integrates a Visual Attention Network (VAN) layer with YOLOv7 to recognize elephants in real time. Second: notifying the relevant authorities. This paper examines the effectiveness of object detection methods for identifying elephants, focusing on different variants of YOLO (You Only Look Once). After comparing these versions, the improved YOLOv7 model demonstrated superior performance on a custom Elephant Detection (ED) dataset. The model was trained on a combination of free and custom elephant datasets, with cloud-based cameras capturing images from multiple locations. The model attained an impressive validation accuracy of 97%.</p>
      </abstract>
      <kwd-group>
        <kwd>Elephant Detection</kwd>
        <kwd>Google Colab</kwd>
        <kwd>YOLOv7</kwd>
        <kwd>Visual Attention Network (VAN)</kwd>
        <kwd>Webcams</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Initial investigations indicate that most collisions take place in particular ’hotspot’ areas where elephant
pathways cross roads or railway tracks. Often, these elephant-vehicle collisions occur because drivers
do not have enough time to react at sharp turns, when driving at night, or under adverse weather conditions.
A vision-based detection system was designed and tested in a prototype early warning system to
address this problem. Initial results show that detection accuracy is satisfactory under
extremely varied lighting conditions, provided that extensive training datasets
capture many challenging scenarios. The prototype has been shown to be robust and reliable as a whole.</p>
      <p>This paper is structured as follows: Section 1 outlines the introduction, Section 2 reviews related
works, Section 3 introduces the proposed system with its components, Section 4 presents
the experimental results, and finally, Section 5 presents the conclusion, followed by references.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Works</title>
      <p>
        Ecological balance depends on the presence of wild animals on Earth. Much research has been
conducted on this topic, yet more is still needed. One such problem is animal-vehicle
collisions (AVCs), a serious threat to biodiversity, for which Saxena et al. proposed an intervention
that uses deep learning techniques for wildlife detection and collision avoidance [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Work has also been done near railway tracks, where models were proposed to detect elephants [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. For crop fields, a grid-based perceptron model to detect elephants efficiently has been proposed [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. Although camera-based methods have recently been used by many researchers to detect animals on roads,
these techniques have inherent limitations. However, advanced deep learning techniques have successfully
been used to detect animals in color images. A solution for detecting animals on roadways is necessary
and must operate within a short timeframe to be time-efficient.
      </p>
      <p>
        Sugumar and Jayaparvathy [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] designed a system for elephant detection that relies on the extraction of visual
features. With the help of computer vision, computers and machines can be trained to recognize
human actions, behavior, and even dialect in a manner similar to people. Visual computing is a part of
Machine Learning (ML) that attempts to find patterns in videos and images by programming
computers to analyze and understand the visual content encoded into digital data.
Deep learning is widely used in image and video processing and in image captioning [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. In deep learning
(DL), artificial neural networks imitate the workings of the human intellect; like the human mind,
machines also learn through neural networks. Over the last few years, DL has
been applied to an incredibly wide spectrum of machine learning problems, and the ’DL ecosystem’ is
rapidly evolving. Object detection is the ’locating’ and ’classifying’ of objects: it involves
locating relevant elements, drawing bounding boxes over them, and then classifying each one. This can be
accomplished using machine learning (ML) and deep learning (DL) methods, with
techniques such as YOLOv7 [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
      <p>
        VAN achieves comparable results on various tasks, including image classification, object detection,
semantic segmentation, panoptic segmentation, and pose estimation [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. Implementing an automated
system for detecting animals and providing warnings can help minimize vehicle-animal collisions
on roads and highways [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. Van Gemert et al. [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] use an automatic animal
counting and warning system. El Abbadi et al. [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], Tan et al. [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], Ulhaq et al. [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] identify and classify
animals. Jawaharlalnehru et al. [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] proposed an improved YOLO with SSD integrated as its
foundational model.
      </p>
      <p>Each algorithm has its advantages and disadvantages, and the choice of algorithm depends
on the specific requirements of the problem. This paper also explores deep learning-based detection
methods used to identify elephants on streets, railway tracks, or in human localities, as well as
sending an informational alert to the concerned authorities so that appropriate action can be taken.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Proposed system and its components</title>
      <sec id="sec-3-1">
        <title>3.1. Dataset construction</title>
        <p>The dataset utilized in this research was sourced from the internet for the purpose of elephant
detection. Images were collected featuring elephants in diverse orientations, lighting conditions, and
backgrounds. Various deep learning techniques can be applied to enhance these images. Additionally,
a Python script was employed to extract images from videos; a sketch of this step is given below.
Figure 1 illustrates the process of data collection and annotation for the elephant dataset using the
LabelImg tool to prepare for training.</p>
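        <p>As an illustration, a minimal frame-extraction sketch using OpenCV follows; the video path, output directory, and sampling interval are hypothetical placeholders rather than the exact script used in this study.</p>
        <preformat>
import cv2
import os

# Sketch: extract frames from a video for dataset construction.
# "elephant_clip.mp4", "frames" and the 1-in-30 sampling rate are
# illustrative assumptions, not the study's actual settings.
def extract_frames(video_path, out_dir, every_n=30):
    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    idx = saved = 0
    while True:
        ret, frame = cap.read()
        if not ret:
            break
        if idx % every_n == 0:  # keep one frame every `every_n` frames
            cv2.imwrite(os.path.join(out_dir, f"frame_{saved:05d}.jpg"), frame)
            saved += 1
        idx += 1
    cap.release()
    return saved

extract_frames("elephant_clip.mp4", "frames")
        </preformat>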
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Dataset</title>
        <p>
          Several elephant image datasets are freely available on open-source websites, such as “The Aerial Elephant Dataset” [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ], “Wild Elephant Dataset” [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ] and “Asian vs African Elephants” [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ].
        </p>
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Data pre-processing</title>
        <p>Data preprocessing is a step taken to improve the quality of data. The process involves organizing
and processing raw data to generate results that are readable and easily accessible. Complexity,
accuracy, and sufficiency are common challenges with image data, yet the importance of image data
processing has been under-investigated in data science. Preprocessing includes grayscale conversion,
normalization, data augmentation, and image standardization. In this study, data augmentation is
employed to expand and adjust the dataset size, and the images are resized accordingly; a sketch of
these steps follows.</p>
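        <p>A minimal sketch of these preprocessing steps (resizing, grayscale conversion, normalization, and a simple flip-based augmentation) follows, assuming OpenCV and NumPy; the 640x640 target size is an illustrative assumption chosen to match common YOLO input sizes.</p>
        <preformat>
import cv2
import numpy as np

# Sketch of the preprocessing steps described above; the 640x640 target
# size and the horizontal-flip augmentation are illustrative assumptions.
def preprocess(image_path, size=640):
    img = cv2.imread(image_path)
    img = cv2.resize(img, (size, size))           # image standardization
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # grayscale conversion
    norm = img.astype(np.float32) / 255.0         # normalization to [0, 1]
    flipped = cv2.flip(img, 1)                    # augmentation: horizontal flip
    return norm, gray, flipped
        </preformat>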
        <p>To prepare the training dataset, frames were extracted from the videos. However, most
of these frames did not contain any elephants. To ensure robust results, only frames with confirmed
elephants were retained, and the others were discarded.</p>
      </sec>
      <sec id="sec-3-4">
        <title>3.4. Data annotation</title>
        <p>To annotate the dataset, the “LabelImg” open-source tool is used. Users draw bounding boxes around
areas of interest and label the classes. After saving, a text file is generated in which the class ID
is listed first, followed by the x-axis and y-axis centers, the width, and the height.
The dataset is subsequently split into training and testing portions for further processing.
LabelImg can export annotations in various formats, including YOLO. If the dataset is labeled with
LabelImg, annotations can be exported directly in YOLO format, simplifying the conversion process.</p>
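        <p>For reference, a short sketch of reading one such YOLO-format label file follows; converting the normalized values back to pixel coordinates assumes the image width and height are known.</p>
        <preformat>
# Sketch: parse a YOLO-format label file (per line: class ID, x-center,
# y-center, width, height, all normalized to [0, 1]) and convert the
# boxes to pixel coordinates for a known image size.
def read_yolo_labels(label_path, img_w, img_h):
    boxes = []
    with open(label_path) as f:
        for line in f:
            cls, xc, yc, w, h = line.split()
            xc, w = float(xc) * img_w, float(w) * img_w
            yc, h = float(yc) * img_h, float(h) * img_h
            x1, y1 = xc - w / 2, yc - h / 2  # top-left corner
            boxes.append((int(cls), x1, y1, w, h))
    return boxes
        </preformat>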
        <p>LabelImg provides key features for efficient image annotation:
• User-Friendly Interface: Built with Python and Qt.
• Annotation Formats: Supports PASCAL VOC XML, YOLO, and CreateML.
• Annotation Modes: Detection, segmentation, and classification.
• Pre-defined Classes: Easily create and manage classes for labeling.
• Keyboard Shortcuts: Quick actions for bounding boxes, saving, and navigation.
• Remote Access: Annotate images directly from remote servers.</p>
      </sec>
      <sec id="sec-3-5">
        <title>3.5. Cloud computing</title>
        <p>The vast amount of data collected from Internet of Things (IoT) devices must be stored on a secure
server, with cloud computing playing a crucial role in this process. Once the data is processed and
analyzed, it helps identify electrical faults, errors, and other system issues more effectively.</p>
      </sec>
      <sec id="sec-3-6">
        <title>3.6. Network connection</title>
        <p>An internet connection is crucial for communication, with each physical object assigned an Internet
Protocol (IP) address. However, as device usage grows, the limited number of available IP addresses
will become inadequate, prompting exploration of alternative identification methods.</p>
      </sec>
      <sec id="sec-3-7">
        <title>3.7. Deep learning</title>
        <p>Deep learning is a subset of machine learning within artificial intelligence that utilizes neural networks
to learn from unstructured or unlabeled data. Also known as deep neural learning or deep neural
networks, it automates learning processes and is inspired by the structure of the human brain.</p>
      </sec>
      <sec id="sec-3-8">
        <title>3.8. YOLOv7 architecture overview</title>
        <p>YOLOv7, built on the Extended Efficient Layer Aggregation Network (E-ELAN), enhances speed and
accuracy with an optimized layer design. Key features include:
• Model Scaling: Customizes depth, width, and resolution for various tasks.
• Auxiliary Head: Provides extra supervision during training.
• Enhanced Loss Function: Boosts training efficiency.
• High Performance: Achieves faster, more accurate inference than YOLOv5 and YOLOv4, with
fewer parameters and lower computational costs.</p>
      </sec>
      <sec id="sec-3-9">
        <title>3.9. Proposed model</title>
        <p>This work proposes a framework designed for real-time elephant identification and alert
creation to protect both humans and elephants. Cloud services are utilized to connect cameras deployed
in various hotspots, with captured images kept in a database for model inspection. For actual deployment
of this trained model, a local cloud is utilized to host the model and establish the connection with
the area’s cameras, ensuring protection against elephant attacks. This local environment is then
incorporated into a global cloud via the internet, forming an indispensable component of the system.</p>
        <p>The model consists of two phases. In the first phase, the YOLOv7 model is optimized with adjustable
metrics using samples from an open-source database, incorporating a Visual Attention Network (VAN)
layer. The developed models are evaluated using data from AI-connected cameras positioned near
localized hotspots, generating different datasets from various locations. Performance variations of the
optimized models are analyzed. During the second phase, the trained model is deployed alongside
a local cloud server to detect elephants in the area. Upon detection, alerts are sent to local forest
authorities and the community as a warning. The proposed model states that integrating YOLOv7
with a Visual Attention Network (VAN) yields an improved version of YOLOv7.</p>
        <p>The model will also work efficiently in real time with data captured through a webcam; a sketch of
this loop is given after this paragraph. This deployment is depicted in Figure 2.</p>
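        <p>A minimal sketch of this detect-then-alert loop follows. The run_detector() and notify_authorities() helpers are hypothetical placeholders for the trained improved YOLOv7 model and the deployment-specific alerting channel (SMS, e-mail, siren, etc.).</p>
        <preformat>
import cv2

# Sketch of the real-time webcam loop: grab frames, run the trained
# detector, and raise an alert when an elephant is found.
# run_detector() and notify_authorities() are hypothetical placeholders.
def monitor(camera_id=0, conf_threshold=0.5):
    cap = cv2.VideoCapture(camera_id)
    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:
            break
        detections = run_detector(frame)  # hypothetical model wrapper
        elephants = [d for d in detections
                     if d["label"] == "elephant" and d["conf"] >= conf_threshold]
        if elephants:
            notify_authorities(frame, elephants)  # hypothetical alert hook
    cap.release()
        </preformat>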
        <p>The experimental results compile various studies conducted over the past years in animal
re-identification and attribute prediction, unifying them under a common framework. The recognition
systems developed for different animal species share a similar algorithmic design approach. While the
original publications in machine vision focused on the technical aspects of these algorithms,
this work examines them from an application perspective. It contextualizes these studies in relation
to one another, emphasizing their shared approach and their capacity to incorporate lifelong learning
techniques that can enhance the decision models used.</p>
        <p>New results are presented for identifying elephants in camera trap videos, showcasing advanced
capabilities for monitoring animals beyond individual identification. These advancements include
predicting individual attributes and implementing lifelong learning with a human-in-the-loop approach.
The latter is increasingly important for real-world applications and long-term monitoring.</p>
      </sec>
      <sec id="sec-3-10">
        <title>3.10. YOLOv7 implementation details</title>
        <p>To train the YOLOv7 model on Google Colab, start by creating a fresh notebook, setting the
runtime type to GPU, and running code to clone the YOLOv7 repository and install the necessary
packages. Download and extract the dataset into a folder, then obtain the YOLOv7 pretrained weight
file, “yolov7.pt”, to fine-tune from pretrained weights instead of training from scratch. Create a
data.yaml file specifying the class names, and verify that the paths inside the data.yaml file located
in the yolov7/dataset directory are correct. Once training completes, the best weights are stored
within the ‘runs’ directory. With this weight file ready, the ‘detect.py’ script can be used to perform
inference and identify elephants in images; the workflow is sketched below.</p>
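        <p>The workflow above condenses into a few Colab cells; a sketch follows, assuming the public WongKinYiu/yolov7 repository layout. Exact flag names, paths, and the run directory (e.g., runs/train/exp) may differ between repository versions and runs.</p>
        <preformat>
# Colab cells (GPU runtime): clone the repository and install dependencies
!git clone https://github.com/WongKinYiu/yolov7
%cd yolov7
!pip install -r requirements.txt

# Download the pretrained weights for fine-tuning
!wget https://github.com/WongKinYiu/yolov7/releases/download/v0.1/yolov7.pt

# Train on the custom elephant dataset described in data.yaml
!python train.py --weights yolov7.pt --data dataset/data.yaml --img-size 640 --batch-size 16 --epochs 100

# Run inference with the best weights saved under runs/
!python detect.py --weights runs/train/exp/weights/best.pt --source test_images/
        </preformat>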
      </sec>
      <sec id="sec-3-11">
        <title>3.11. Improved YOLOv7 with Visual Attention Network (VAN)</title>
        <p>
          The attention framework is widely employed in machine learning plus deep learning algorithms. [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ]. The integration of VAN with YOLOv7 is illustrated in Figure 3. To visualize YOLOv7’s integration
with the Visual Attention Network (VAN), a diagram can illustrate the data flow and process steps.
        </p>
        <p>Components overview:
1. Input Image: The original image or video frame to be processed for object detection.
2. YOLOv7 Object Detection Module: Detects objects in real-time, outputting bounding boxes and
class labels for detected objects.
3. Extract Regions of Interest (ROIs): Crops areas of detected objects based on YOLOv7’s bounding
boxes for focused analysis.
4. Visual Attention Network (VAN): Applies attention mechanisms to ROIs to enhance feature
extraction, focusing on relevant areas and improving representation.
5. Post-Processing: Combines VAN’s outputs with YOLOv7’s results, refining detection accuracy
with attention maps or enhanced bounding boxes.
6. Final Output: An image with detected objects and VAN’s enhancements, such as attention maps
or refined bounding boxes.</p>
        <p>Conclusion: This integrated approach leverages YOLOv7’s real-time detection and VAN’s advanced
attention mechanisms to improve object detection by isolating ROIs and applying targeted feature
extraction, enhancing accuracy and performance in computer vision applications.</p>
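        <p>As a concrete illustration of the attention mechanism involved, a minimal PyTorch sketch of VAN’s large-kernel attention (LKA) block [<xref ref-type="bibr" rid="ref8">8</xref>] follows. The decomposition into a 5x5 depth-wise convolution, a 7x7 dilated depth-wise convolution, and a 1x1 point-wise convolution follows the VAN paper; the exact placement inside YOLOv7 is as described above, and the feature-map shape in the example is an arbitrary assumption.</p>
        <preformat>
import torch
import torch.nn as nn

class LargeKernelAttention(nn.Module):
    """Sketch of VAN's large-kernel attention (LKA): a depth-wise conv,
    a dilated depth-wise conv, and a point-wise conv produce an attention
    map that re-weights the input features element-wise."""
    def __init__(self, dim):
        super().__init__()
        self.dw = nn.Conv2d(dim, dim, kernel_size=5, padding=2, groups=dim)
        self.dw_dilated = nn.Conv2d(dim, dim, kernel_size=7, padding=9,
                                    dilation=3, groups=dim)
        self.pw = nn.Conv2d(dim, dim, kernel_size=1)

    def forward(self, x):
        attn = self.pw(self.dw_dilated(self.dw(x)))
        return x * attn  # attention-weighted features

# Example: re-weight a feature map from an intermediate detector stage
feats = torch.randn(1, 256, 40, 40)
out = LargeKernelAttention(256)(feats)  # same shape as the input
        </preformat>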
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Results and discussion</title>
      <p>Various hyperparameters, such as scaled image size and batch size, can significantly influence the
identification accuracy of the improved YOLOv7 model. Metrics such as precision, recall, F1-score, and
accuracy are commonly used to evaluate the model’s performance. Precision measures the ratio of
correct positive predictions to the total number of positive predictions (false positives included), while
recall assesses the proportion of correct positive predictions to the total actual positives (including
false negatives). Accuracy is calculated from both the count of correct predictions and the count of
erroneous ones. The overall accuracy of the proposed model was computed using the formula provided
in Equation (1); a small computational sketch follows.</p>
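      <p>These metrics can be computed directly from the confusion-matrix counts. In the sketch below the example counts are illustrative placeholders, not the study’s reported figures.</p>
      <preformat>
# Sketch: compute the evaluation metrics from confusion-matrix counts.
# The example counts are illustrative placeholders only.
def metrics(tp, tn, fp, fn):
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + tn + fp + fn)  # Equation (1)
    return precision, recall, f1, accuracy

print(metrics(tp=1125, tn=485, fp=16, fn=34))
      </preformat>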
      <p>An object detection module was developed to identify elephants in localized hotspot areas. The
accuracy and speed of the proposed improved YOLOv7 model were compared to those of the earlier
YOLOv7 for elephant detection. Before testing, the YOLOv7 algorithms were trained and
validated using an elephant detection dataset. The confusion matrices of elephant detection for improved
YOLOv7 and YOLOv7 are shown in Figure 4 and Figure 5, respectively.</p>
      <p>There are mainly two classes, i.e., “elephant” and “not an elephant”. A total of 5,535 images from the
prepared custom dataset were used for evaluating the models, comprising 4,324 elephant images and
1,211 non-elephant images.</p>
      <p>The models were trained with 3,165 original elephant images and 710 non-elephant images.
The image dataset details for model evaluation are shown in Table 1.</p>
      <p>The testing or validation of the models was done with 1,660 images, of which 1,159 are positive
(containing elephants) and 501 are negative (containing no elephants). Table 2 depicts the results of
the experiments.</p>
      <p>The accuracy of the improved YOLOv7 is 98% during training, while during testing or validation
it is 97%. The detailed individual results of the models are shown in Table 3 for the
YOLOv7 model and Table 4 for the improved YOLOv7 model.</p>
      <p>Correct predictions means the true positives plus true negatives of the model during the testing phase,
whereas total predictions means the sum of all predicted values. The object-detected image of an
elephant evaluated with the YOLOv7 model is shown in Figure 6.</p>
      <p>The object-detected image of an elephant evaluated with the improved YOLOv7 model is shown in Figure 7.</p>
      <p>Accuracy: It is defined as the ratio of correct predictions made by the proposed model to the total
number of predictions. This metric is particularly effective when the classes of the target variable are
balanced within the dataset. It can be represented as follows.</p>
      <disp-formula id="eq1">
        <label>(1)</label>
        <tex-math><![CDATA[\text{Accuracy} = \frac{\text{Correct Predictions}}{\text{Total Predictions}} = \frac{TP + TN}{TP + TN + FP + FN}]]></tex-math>
      </disp-formula>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>In this paper, an integration of YOLOv7 with a Visual Attention Network (VAN) is presented, which
yields an improved version of YOLOv7. Incorporating VAN into YOLOv7 enables fast inference backed
by the ability to understand what it sees; combined, they optimize accuracy, reliability, and speed in
complex conditions for various real-time domains. The deep learning algorithm is combined with a
Visual Attention Network to improve the control of parameters, the selection of parameters, and the
convergence rate, thereby improving the image detection and classification models. In order to prove
the efficacy of the proposed method, it was implemented on an open-source elephant dataset. Both the
experimental outcome and computational analysis suggest that the enhanced YOLOv7 has robust
optimization features and is capable of excelling in elephant recognition and differentiation, notably in
terms of precision, recall, and F1 measure. The proposed model achieved a training accuracy of 98%
and a validation (or testing) accuracy of 97%.</p>
    </sec>
    <sec id="sec-6">
      <title>Declaration on Generative AI</title>
      <p>The author(s) have not employed any Generative AI tools.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>A.</given-names>
            <surname>Saxena</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. K.</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <article-title>An animal detection and collision avoidance system using deep learning</article-title>
          ,
          <source>Advances in Communication and Computational Technology: Select Proceedings of ICACCT 2019. Springer Singapore</source>
          <volume>668</volume>
          (
          <year>2021</year>
          )
          <fpage>1069</fpage>
          --
          <lpage>1084</lpage>
          . doi:
          <volume>10</volume>
          .1007/
          <fpage>978</fpage>
          -981-15-5341-7_
          <fpage>81</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>R. K.</given-names>
            <surname>Mandal</surname>
          </string-name>
          ,
          <article-title>A prototype model to detect elephants near the railway tracks</article-title>
          ,
          <source>Advances in Modelling and Analysis B</source>
          <volume>63</volume>
          .1-
          <fpage>4</fpage>
          (
          <year>2020</year>
          )
          <fpage>7</fpage>
          -
          <lpage>9</lpage>
          . doi:
          <volume>10</volume>
          .18280/ama_b.
          <fpage>631</fpage>
          -
          <lpage>402</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>R. K.</given-names>
            <surname>Mandal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. D.</given-names>
            <surname>Bhutia</surname>
          </string-name>
          ,
          <article-title>A proposed artificial neural network (ann) model using geophone sensors to detect elephants near the railway tracks</article-title>
          ,
          <source>Advanced Computational and Communication Paradigms: Proceedings of International Conference on ICACCP 2017</source>
          , Volume
          <volume>2</volume>
          , Springer Singapore 2 (
          <year>2018</year>
          )
          <fpage>1</fpage>
          -
          <lpage>6</lpage>
          . doi:
          <volume>10</volume>
          .1007/
          <fpage>978</fpage>
          -981-10-8237-
          <issue>5</issue>
          _
          <fpage>1</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>R. K.</given-names>
            <surname>Mullick</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. K.</given-names>
            <surname>Mandal</surname>
          </string-name>
          ,
          <article-title>A proposed grid-based elephant detection model using artificial intelligence (ai) to prevent crop damage in farming fields</article-title>
          ,
          <source>Doctoral Symposium on Intelligence Enabled Research</source>
          , Singapore, Recent Trends in Intelligence Enabled Research, DoSIER
          <year>2023</year>
          ,
          <source>Advances in Intelligent Systems and Computing</source>
          , Springer Nature Singapore
          <volume>1457</volume>
          (
          <year>2023</year>
          )
          <fpage>55</fpage>
          -
          <lpage>66</lpage>
          . doi:
          <volume>10</volume>
          .1007/
          <fpage>978</fpage>
          -981-97-2321-
          <issue>8</issue>
          _
          <fpage>5</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>S. J.</given-names>
            <surname>Sugumar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Jayaparvathy</surname>
          </string-name>
          ,
          <article-title>Automated unsupervised elephant image detection system as a solution to human elephant conflict</article-title>
          ,
          <source>Proceedings of the International Conference on Multimedia Processing. Communication and Information Technology</source>
          ,
          <string-name>
            <surname>MPCIT</surname>
          </string-name>
          (
          <year>2013</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>U.</given-names>
            <surname>Sirisha</surname>
          </string-name>
          , S. C. B,
          <article-title>Semantic interdisciplinary evaluation of image captioning models</article-title>
          ,
          <source>Cogent Engineering</source>
          <volume>9</volume>
          (
          <year>2022</year>
          ). doi:
          <volume>10</volume>
          .1080/23311916.
          <year>2022</year>
          .
          <volume>2104333</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>C.-Y.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bochkovskiy</surname>
          </string-name>
          , H.
          <string-name>
            <surname>-Y. M. Liao</surname>
          </string-name>
          ,
          <article-title>Yolov7: Trainable bag-of-freebies sets new state-of-theart for real-time object detectors</article-title>
          ,
          <source>Proceedings of the IEEE/CVF conference on computer vision and pattern recognition</source>
          (
          <year>2023</year>
          )
          <fpage>7464</fpage>
          -
          <lpage>7475</lpage>
          . doi:
          <volume>10</volume>
          .48550/arXiv.2207.02696.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>M.</given-names>
            <surname>Tan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Chao</surname>
          </string-name>
          , J.-K. Cheng, M. Zhou,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Ma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Ge</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Feng</surname>
          </string-name>
          ,
          <article-title>Animal detection and classification from camera trap images using diferent mainstream object detection architectures</article-title>
          ,
          <source>Animals</source>
          <volume>12</volume>
          .15 (
          <year>2022</year>
          ). doi:
          <volume>10</volume>
          .3390/ani12151976.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>H. R.</given-names>
            <surname>Sodagar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. E.-P.</given-names>
            <surname>Rezaee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Shekarian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Rahmati</surname>
          </string-name>
          ,
          <article-title>E-learning of router applications to drivers in order to reduce collisions and road accidents with wild animals</article-title>
          ,
          <source>Interdisciplinary Journal of Virtual Learning in Medical Sciences 13.1</source>
          (
          <year>2022</year>
          )
          <fpage>63</fpage>
          -
          <lpage>65</lpage>
          . doi:
          <volume>10</volume>
          .30476/ijvlms.
          <year>2022</year>
          .
          <volume>94592</volume>
          . 1139.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>R.</given-names>
            <surname>Gandhi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. K.</given-names>
            <surname>Yadav</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Rathee</surname>
          </string-name>
          ,
          <article-title>A novel approach of object detection using deep learning for animal safety</article-title>
          ,
          <year>2022</year>
          , 12th International Conference on Cloud Computing, Data Science &amp;
          <string-name>
            <surname>Engineering (Confluence). IEEE</surname>
          </string-name>
          (
          <year>2022</year>
          )
          <fpage>573</fpage>
          -
          <lpage>577</lpage>
          . doi:
          <volume>10</volume>
          .1109/Confluence52989.
          <year>2022</year>
          .
          <volume>9734225</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>D.</given-names>
            <surname>Sato</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. J.</given-names>
            <surname>Zanella</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. J. X.</given-names>
            <surname>Costa</surname>
          </string-name>
          ,
          <article-title>Computational classification of animals for a highway detection system</article-title>
          ,
          <source>Brazilian Journal of Veterinary Research and Animal Science</source>
          <volume>58</volume>
          (
          <year>2021</year>
          )
          <fpage>1</fpage>
          -
          <lpage>10</lpage>
          . doi:
          <volume>10</volume>
          .11606/issn.1678-
          <fpage>4456</fpage>
          .bjvras.
          <year>2021</year>
          .
          <volume>174951</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Munian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Martinez-Molina</surname>
          </string-name>
          ,
          <string-name>
            <surname>M.</surname>
          </string-name>
          <article-title>Alamaniotis, Intelligent system for detection of wild animals using hog and cnn in automobile applications</article-title>
          ,
          <source>2020 11th International Conference on Information, Intelligence, Systems and Applications (IISA)</source>
          ,
          <source>IEEE</source>
          (
          <year>2020</year>
          )
          <fpage>1</fpage>
          -
          <lpage>8</lpage>
          . doi:
          <volume>10</volume>
          .1109/IISA50023.
          <year>2020</year>
          .
          <volume>9284365</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>S. U.</given-names>
            <surname>Sharma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. J.</given-names>
            <surname>Shah</surname>
          </string-name>
          ,
          <article-title>A practical animal detection and collision avoidance system using computer vision technique</article-title>
          ,
          <source>IEEE access 5</source>
          (
          <year>2016</year>
          )
          <fpage>347</fpage>
          -
          <lpage>358</lpage>
          . doi:
          <volume>10</volume>
          .1109/ACCESS.
          <year>2016</year>
          .
          <volume>2642981</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <surname>J. C. van Gemert</surname>
            ,
            <given-names>C. R.</given-names>
          </string-name>
          <string-name>
            <surname>Verschoo</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <string-name>
            <surname>Mettes</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          <string-name>
            <surname>Epema</surname>
            ,
            <given-names>L. P.</given-names>
          </string-name>
          <string-name>
            <surname>Koh</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Wich</surname>
          </string-name>
          ,
          <article-title>Nature conservation drones for automatic localization and counting of animals</article-title>
          ,
          <source>Computer Vision-ECCV 2014Workshops: Zurich, Switzerland, September 6-7 and 12</source>
          ,
          <year>2014</year>
          , Proceedings,
          <source>Part I 13</source>
          , Springer International Publishing (
          <year>2015</year>
          )
          <fpage>255</fpage>
          -
          <lpage>270</lpage>
          . doi:
          <volume>10</volume>
          .1007/978-3-
          <fpage>319</fpage>
          -16178-5_
          <fpage>17</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>N. K. E.</given-names>
            <surname>Abbadi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. M. T. A.</given-names>
            <surname>Alsaadi</surname>
          </string-name>
          ,
          <article-title>An automated vertebrate animals classification using deep convolution neural networks</article-title>
          ,
          <source>2020 International Conference on Computer Science and Software Engineering (CSASE)</source>
          ,
          <source>IEEE</source>
          (
          <year>2020</year>
          )
          <fpage>72</fpage>
          -
          <lpage>77</lpage>
          . doi:
          <volume>10</volume>
          .1109/CSASE48920.
          <year>2020</year>
          .
          <volume>9142070</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>A.</given-names>
            <surname>Ulhaq</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Adams</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. E.</given-names>
            <surname>Cox</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Khan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Low</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Pau</surname>
          </string-name>
          ,
          <article-title>Automated detection of animals in low-resolution airborne thermal imagery</article-title>
          ,
          <source>Remote Sensing 13.16</source>
          (
          <year>2021</year>
          )
          <article-title>3276</article-title>
          . doi:
          <volume>10</volume>
          .3390/ rs13163276.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>A.</given-names>
            <surname>Jawaharlalnehru</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Sambandham</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Sekar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ravikumar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Loganathan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Kannadasan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. A.</given-names>
            <surname>Khan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Wechtaisong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Haq</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Alhussen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z. S.</given-names>
            <surname>Alzamil</surname>
          </string-name>
          ,
          <article-title>Target object detection from unmanned aerial vehicle (uav) images based on improved yolo algorithm</article-title>
          ,
          <source>Electronics</source>
          <volume>11</volume>
          .15 (
          <year>2022</year>
          )
          <article-title>2343</article-title>
          . doi:
          <volume>10</volume>
          .3390/electronics11152343.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          <source>[18] The aerial elephant dataset</source>
          ,
          <year>2024</year>
          . URL: https://zenodo.org/records/3234780, [Accessed in May 15,
          <year>2024</year>
          ].
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <surname>Wild</surname>
            <given-names>elephant dataset</given-names>
          </string-name>
          ,
          <year>2024</year>
          . URL: https://www.kaggle.com/datasets/gunarakulangr/ sri-lankan
          <article-title>-wild-elephant-</article-title>
          <string-name>
            <surname>dataset</surname>
          </string-name>
          ,
          <source>[Accessed in May 15</source>
          ,
          <year>2024</year>
          ].
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <article-title>Asian vs african elephants</article-title>
          ,
          <year>2024</year>
          . URL: https://www.kaggle.com/datasets/vivmankar/ asian
          <article-title>-vs-african-elephant-image-</article-title>
          <string-name>
            <surname>classification</surname>
          </string-name>
          ,
          <source>[Accessed in May 15</source>
          ,
          <year>2024</year>
          ].
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>