Potato Leaf Disease Detection using CNN - A Lightweight
                         Approach
                         Abhisek Saha1,*,† , Syed Mohammed Musharraf1,† , Anubhav Dey1,† , Hiranmoy Roy2,*,† and
                         Debotosh Bhattacharjee3,*,†
                         1 Netaji Subhash Engineering College, Techno City, Garia, Ranabhutia, West Bengal, Kolkata 700152, India
                         2 Department of Information Technology, RCC Institute of Information Technology, Canal South Road, Kolkata 700015, India
                         3 Department of Computer Science & Engineering, Jadavpur University, Kolkata-700032, India


                                     Abstract
                                     Early detection of potato leaf diseases is of great significance to the agricultural industry.
                                     Conventional methods of disease identification are unreliable, complex, or costly, which makes
                                     them impractical. With the recent boom in artificial intelligence, many procedures have emerged
                                     to help solve this problem. Because data is the fuel for such procedures, it is very important to
                                     source reliable and accurate data for training AI-based models. Detecting diseases in potato leaves
                                     is challenging because the symptoms vary considerably with species, climate, and environmental
                                     factors. Popular pretrained models used to classify plant diseases include VGG16, InceptionV3,
                                     and ResNet50. In this research we build a custom Convolutional Neural Network classification
                                     model that is more robust and lightweight than the existing approaches. The model follows a very
                                     simple design and is trained on two standard publicly available datasets, PlantVillage and PLD.
                                     The proposed model shows promising and consistent results, with accuracies of 99.3% and 99.23%
                                     on the two datasets respectively. To achieve this accuracy, we apply the image enhancement
                                     algorithm CLAHE at the preprocessing stage, after data acquisition.

                                     Keywords
                                     Potato Leaf Disease, CNN, Image Enhancement, Image classification


                         1. Introduction
                         Since the dawn of human civilization, agriculture has played a crucial role in transforming people
                         from roving hunter-gatherers to established citizens [1]. It has facilitated the growth of large human
                         populations by providing a reliable and stable source of nutrition. The history of agriculture is a
                         long continuum of groundbreaking innovations, evolving rapidly through industrial revolutions and
                         advancements in modern science, particularly in the 20th and 21st centuries. Unlike a single, definitive
                         moment, the origin of agriculture in human civilization unfolded over centuries and cannot be precisely
                         dated. Researchers concur that early Homo Sapiens began transitioning from a nomadic lifestyle to
                         settling down, domesticating animals, and cultivating cereal seeds during the early Neolithic period,
                         known as the Neolithic Revolution [2]. This shift likely took place as glaciers retreated northward and
                         the climate warmed, approximately 10,000 years ago—though some estimates place it at 12,000 or even
                         15,000 years ago. This shift further led to the development of complex societies, as humans learnt
                         to trade, and drove the growth of economies and exchange. From a contemporary standpoint, agriculture
                         is a very dynamic industry that is essential to meet the basic food needs of the entire population.
                         In a developing nation like India, agriculture is an important sector of the economy: it contributes
                         about 15% of total GDP and employs about 60% of the population [3]. However, despite being a
                         formidable industry, the sector suffers heavily from crop diseases. Plant

                         The 2024 Sixth Doctoral Symposium on Intelligence Enabled Research (DoSIER 2024), November 28–29, 2024, Jalpaiguri, India
                         *
                           Corresponding author.
                         †
                           These authors contributed equally.
                         Email: saha.abhisek@gmail.com (A. Saha); syedmdmusharraf@gmail.com (S. M. Musharraf); anubhavd56@gmail.com (A. Dey);
                         hiranmoy.roy@rcciit.org.in (H. Roy); debotoshb@hotmail.com (D. Bhattacharjee)
                                  © 2025 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).


CEUR Workshop Proceedings (ceur-ws.org), ISSN 1613-0073
diseases can seriously affect the leaves, fruits, and other parts of crops, degrading both crop
quality and yield [4]. This, in turn, contributes to food scarcity and insecurity on a global scale.
It is estimated that crop diseases cause an annual loss of around 16% in global crop yields, making them
a major factor behind famines and rising production costs. According to predictions from the Food
and Agriculture Organization (FAO), there will be 9.5 billion people on Earth within the next thirty years
[5], meaning that a 75% increase in food production is necessary to provide a consistent supply of
food. Plants and their products are affected by both diseases and disorders. Unlike disorders, which
are mostly caused by factors such as rainfall, temperature, moisture, and nutritional deficiencies,
diseases are caused by biotic agents such as fungi, bacteria, and algae [6].
For the well-being of crops, early and precise identification of diseases is necessary so that the
correct treatment can be applied in time. Various methods are available for diagnosing plant diseases,
the simplest being visual inspection. Conventional diagnosis frequently depends on the farmer's
knowledge, which can be erratic and unreliable. Because the process is lengthy and experts are scarce
in remote places [1], this strategy is frequently impracticable. To improve accuracy, researchers have
introduced spectrometers to distinguish between healthy and infected plant leaves [7]. Another technique
analyzes leaf DNA using the polymerase chain reaction (PCR) [8]. These methods are complex, costly, and
time-consuming, requiring specialized skills, controlled experimental conditions, and extensive use of
crop safety equipment. Artificial intelligence can play a significant role here: deep learning models
trained on labeled samples can perform automated, efficient, and accurate leaf disease detection [9].
We have chosen potato as the crop of interest, and our model classifies leaf images as early blight,
late blight, or healthy. Potato (Solanum tuberosum) [10] is a temperate crop grown under subtropical
conditions in India. Loose, muddy, and sandy soils rich in organic matter are ideal for growing potato
crops; alkaline and saline soils are not suitable [11]. Potatoes are a great source of fiber, may help
prevent heart disease, and are valuable for overall health. Their high antioxidant content aids in the
defense against conditions including excessive cholesterol and irregular blood sugar levels [12].
Potatoes have been grown in India for more than three centuries and are currently one of the most
widely grown vegetable crops in the country. They are a cost-effective food that contributes
inexpensive energy to the human diet.
To train our model, we use the publicly available PlantVillage and PLD datasets. Both contain potato
leaf images for early blight and late blight as well as healthy leaves. Early blight first appears
towards the base of the plant as roughly circular brown spots on the leaves and stems. This fungal
infection can be deadly for tubers, leaves, and stems, causing problems such as reduced tuber size and
low crop yield. Late blight, another fungal disease that affects potato crops, first manifests as
patches on the stems. The patches spread very quickly, creating large dark brown and black areas that
frequently look oily [1]. However, these symptoms vary markedly with the region, climate, species, and
other factors, which makes it difficult to build a generalized classifier.
Kamal et al. [13] developed two models based on the MobileNet architecture and applied them to the
PlantVillage dataset. They achieved accuracies of 97.65% and 98.34% with Reduced MobileNet and
Modified MobileNet respectively for classification of 55 classes in the PlantVillage dataset. Liang et al.
Karthik and his co-authors [14] developed two deep architectures, the first based on residual learning
and the second on an attention mechanism. The attention-based model attained an accuracy of 98% on the
public PlantVillage dataset for the detection of diseased tomato leaves. Khamparia et al. [15] designed
a hybrid architecture for disease detection in potato leaves using a Deep Neural Network and
autoencoders, and attained an accuracy of 97.50% on the PlantVillage dataset. Islam et al. [16]
presented an approach that integrates image processing and machine learning to classify potato leaf
diseases. They used the well-known SVM algorithm and attained an accuracy of 95% over 300 images. For
the identification and classification of cassava disease, Sambasivam and his team [17] developed a deep
neural network trained on a very small dataset with significant class imbalance. Nagarjan and his
co-authors [18] introduced the Kuan filtered Hough transformation based reweighted linear program boost
classification (KFHT-RLPBC) technique as a possible way to improve disease identification accuracy with
minimal time investment. Using the PlantVillage dataset, they achieved an accuracy of 92%.
Geetharamani et al. [19] proposed a deep Convolutional Neural Network for identifying plant leaf
diseases. After achieving an accuracy of 96.46% on the PlantVillage dataset, they intended to carry out
a more thorough analysis of the training procedure without using labeled images. Table 1 summarizes the
related work.
Most current methods for crop leaf disease detection use popular transfer learning models, but these
models often have a large number of parameters, which leads to high computational cost. Other
approaches implement custom Convolutional Neural Networks (CNNs) with reduced parameter counts;
however, they generally fall short of high accuracy. Thus, there is a need for a more efficient,
lightweight model for crop leaf disease detection.
To detect diseases in potato leaves, we propose a novel lightweight Convolutional Neural Network (CNN)
that can recognize both simple and abstract patterns. The model architecture, illustrated in Figure 1,
comprises convolutional and max-pooling layers for edge feature extraction, batch normalization to
normalize the input values, dropout layers to reduce overfitting, and fully connected layers for
classification. We also use the Contrast Limited Adaptive Histogram Equalization (CLAHE) technique for
image preprocessing. The major contributions are as follows.

    • The model introduces a lightweight CNN architecture that is both less complex in terms of
      parameter count and highly accurate for disease detection in potato leaves.
    • Through the use of convolution and max pooling layers, it is able to capture both intricate patterns
      and minute details.
    • The addition of CLAHE raises the picture quality, which strengthens the model's capacity to
      accurately detect damaged potato leaves.
    • Data augmentation techniques are used to make the training and testing datasets larger and more
      balanced. The classifier's capacity for generalization increases with the use of various data
      augmentation strategies.
The study is organized as follows: the proposed CNN architecture is thoroughly explained in Section 2,
the experiments and comparison findings are presented in Section 3, and the work is concluded in
Section 4.

Table 1
Summary of disease prediction of potato leaves
 Ref.    Algorithm                                         Dataset             Plant                   Accuracy
 [13]    Modified MobileNet                                PlantVillage        Potato                  98.34%
 [20]    ResNet50                                          PlantVillage        Potato                  98%
 [14]    Attention-Based Residual Network                  PlantVillage        Tomato                  98%
 [17]    CNN                                               Cassava Challenge   Cassava                 93%
 [18]    Reweighted Linear Program Boost Classification    PlantVillage        Multiple                92%
 [15]    CNN and Autoencoders                              PlantVillage        Potato, Maize, Tomato   97.5%
 [19]    Deep CNN                                          PlantVillage        Potato                  96.46%
 [16]    Segmentation and Multiclass SVM                   PlantVillage        Potato                  95%


2. Proposed Methodology
This section comprises three subsections: Acquisition of Data, Data Preprocessing, and Classification.
2.1. Acquisition of Data
The images of potato leaf diseases were sourced from two publicly available datasets: the PLD dataset [5]
and the PlantVillage dataset [21]. Both datasets include images of early blight, late blight, and
healthy leaves. The PlantVillage dataset provided a total of 2,152 images: 1,000 for each of the two
blight diseases and 152 for healthy leaves (refer to Table 4). Because these images are limited in
number and imbalanced across classes, an additional 3,251 photos from Pakistan's Central Punjab were
included from the PLD dataset. After removal of redundant images, this dataset contains 816 healthy
images, 1,303 early blight images, and 1,132 late blight images (refer to Table 5). All photos are
saved in uncompressed JPG format and have RGB color profiles.

2.2. Data Preprocessing
In the preprocessing stage, we employ the CLAHE [22] algorithm, an image processing technique designed
to enhance image contrast. In contrast to conventional histogram equalization, CLAHE operates on
discrete areas of the image, known as tiles, rather than on the full image at once. Within each tile,
CLAHE adjusts the contrast adaptively based on the local histogram, allowing it to enhance detail
without overly amplifying noise. After each tile is processed, neighboring tiles are blended smoothly
(typically by bilinear interpolation) to prevent visible boundaries. The contrast amplification around
each pixel is determined by the slope of the transformation function. To limit this amplification,
CLAHE clips the histogram at a predefined threshold before computing the cumulative distribution
function, which bounds the slope and therefore the enhancement. By limiting contrast adjustments in
uniform areas, CLAHE minimizes noise amplification and produces a balanced, enhanced image.
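The clipping step can be illustrated on a single tile. The following is a simplified sketch, not the authors' implementation: the function name `clipped_equalization_map`, the 8 gray levels, and the absolute clip limit of 4 counts per bin are assumptions made for illustration, and the tiling and blending between tiles that full CLAHE performs are omitted.

```python
def clipped_equalization_map(tile, levels=8, clip_limit=4):
    """Build a gray-level lookup table for one tile from a clipped histogram.

    tile: 2D list of ints in [0, levels).
    clip_limit: maximum count allowed per histogram bin (illustrative value).
    """
    # Local histogram of the tile.
    hist = [0] * levels
    for row in tile:
        for value in row:
            hist[value] += 1

    # Clip every bin at the limit and collect the excess counts.
    excess = sum(max(0, count - clip_limit) for count in hist)
    clipped = [min(count, clip_limit) for count in hist]

    # Redistribute the excess uniformly over all bins.
    share = excess / levels
    clipped = [count + share for count in clipped]

    # Cumulative distribution of the clipped histogram, scaled to the gray range.
    total = sum(clipped)
    lut, running = [], 0.0
    for count in clipped:
        running += count
        lut.append(round((levels - 1) * running / total))
    return lut


# A low-contrast tile whose values cluster around levels 2 and 3.
tile = [
    [2, 2, 2, 2],
    [2, 2, 2, 3],
    [2, 2, 3, 3],
    [2, 3, 3, 4],
]
lut = clipped_equalization_map(tile)
enhanced = [[lut[value] for value in row] for row in tile]
```

Because each bin is capped before the cumulative sum is taken, no single gray level can dominate the mapping, which is how the slope (and with it the noise amplification) stays bounded.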

2.3. Classification
CNNs are a kind of deep learning method used mainly for image processing, recognition, and
classification. Their design is loosely modeled on the human brain: just as the brain is built from
neurons, artificial neurons form the backbone of a neural network. A CNN consists of several kinds of
layers, namely convolutional, max-pooling, dropout, and fully connected layers.
The convolution operation for a 2D convolutional layer can be represented as:
\[ y_{p,q,r} = \sum_{i=1}^{M} \sum_{j=1}^{N} \sum_{k=1}^{C} x_{p+i,\,q+j,\,k} \cdot w_{i,j,k,r} + b_r \]

    • 𝑦𝑝,𝑞,𝑟 : Output feature map at position (𝑝, 𝑞) in channel 𝑟.
    • 𝑥𝑝+𝑖,𝑞+𝑗,𝑘 : Input feature map at position (𝑝 + 𝑖, 𝑞 + 𝑗) in channel 𝑘.
    • 𝑤𝑖,𝑗,𝑘,𝑟 : Convolution filter weight of size (𝑀 × 𝑁 ) for channel 𝑘 and output channel 𝑟.
    • 𝑏𝑟 : Bias term for channel 𝑟.
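As a minimal sketch of the convolution sum above, the loop below evaluates it directly in plain Python. It assumes zero-based indices, a stride of 1, and no padding; the function name `conv2d_valid` and the nested-list layout are illustrative choices, not part of the paper.

```python
def conv2d_valid(x, w, b):
    """Direct evaluation of the 2D convolution sum.

    x: input as nested lists indexed [row][col][channel]      (H x W x C)
    w: filters indexed [i][j][k][r]                            (M x N x C x R)
    b: one bias per output channel                             (R)
    Returns y indexed [p][q][r] with shape (H-M+1) x (W-N+1) x R.
    """
    H, W, C = len(x), len(x[0]), len(x[0][0])
    M, N, R = len(w), len(w[0]), len(w[0][0][0])
    y = [[[0.0] * R for _ in range(W - N + 1)] for _ in range(H - M + 1)]
    for p in range(H - M + 1):
        for q in range(W - N + 1):
            for r in range(R):
                acc = b[r]
                for i in range(M):
                    for j in range(N):
                        for k in range(C):
                            acc += x[p + i][q + j][k] * w[i][j][k][r]
                y[p][q][r] = acc
    return y


# 3x3 single-channel input, one 2x2 filter of ones, bias 1.
x = [[[1.0], [2.0], [3.0]],
     [[4.0], [5.0], [6.0]],
     [[7.0], [8.0], [9.0]]]
w = [[[[1.0]], [[1.0]]],
     [[[1.0]], [[1.0]]]]
b = [1.0]
y = conv2d_valid(x, w, b)  # 2x2x1 output
```

With no padding, an M x N filter shrinks each spatial dimension by M-1 and N-1, so a 3x3 filter removes two rows and two columns per layer.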

For max-pooling, the operation can be written as:

\[ y_{p,q,r} = \max_{i,j} \left( x_{p+i,\,q+j,\,r} \right) \]

    • 𝑦𝑝,𝑞,𝑟 : Output of max-pooling at position (𝑝, 𝑞) in channel 𝑟.
    • 𝑥𝑝+𝑖,𝑞+𝑗,𝑟 : Input feature map over a pooling window defined by (𝑖, 𝑗).
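The pooling operation can be sketched in the same style. This example assumes a square window, zero-based indices, and a window that starts at `p * stride`, which is the usual convention when the stride is larger than 1; the name `max_pool2d` is illustrative.

```python
def max_pool2d(x, size=2, stride=2):
    """Max-pooling over nested lists indexed [row][col][channel].

    Windows that would run past the border are dropped, so the output has
    (H - size) // stride + 1 rows and (W - size) // stride + 1 columns.
    """
    H, W, C = len(x), len(x[0]), len(x[0][0])
    out_h = (H - size) // stride + 1
    out_w = (W - size) // stride + 1
    return [[[max(x[p * stride + i][q * stride + j][r]
                  for i in range(size) for j in range(size))
              for r in range(C)]
             for q in range(out_w)]
            for p in range(out_h)]


# 4x4 single-channel input holding the values 1..16 row by row.
x = [[[float(4 * row + col + 1)] for col in range(4)] for row in range(4)]
pooled = max_pool2d(x)  # 2x2x1 output

# An odd-sized input loses its last row and column: 5x5 becomes 2x2.
odd = [[[0.0] for _ in range(5)] for _ in range(5)]
pooled_odd = max_pool2d(odd)
```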

The following is an expression for a Fully connected layer’s output.
\[ y_j = \sum_{i=1}^{N} w_{j,i} \, x_i + b_j \]
    • 𝑦𝑗 : Output of the 𝑗 th neuron in the layer.
    • 𝑥𝑖 : Input from the 𝑖th neuron in the previous layer.
    • 𝑤𝑗,𝑖 : Weight connecting the 𝑖th neuron in the previous layer to the 𝑗 th neuron.
    • 𝑏𝑗 : Bias term for the 𝑗 th neuron.

Softmax function, used to convert final layer’s logits into probabilities, is given by:

\[ \sigma(z)_i = \frac{e^{z_i}}{\sum_{j=1}^{N} e^{z_j}} \]

    • 𝜎(𝑧)𝑖 : Softmax output for class 𝑖.
    • 𝑧𝑖 : Logit (raw output) for class 𝑖.
    • 𝑁 : Total number of classes.
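The fully connected and softmax expressions combine into the final classification step. The sketch below uses made-up weights for a 2-input, 3-class layer purely for illustration; subtracting the maximum logit before exponentiating is a standard numerical-stability step that leaves the result unchanged.

```python
import math


def dense(x, weights, biases):
    """y_j = sum_i w[j][i] * x[i] + b[j] for every output neuron j."""
    return [sum(w_ji * x_i for w_ji, x_i in zip(row, x)) + b_j
            for row, b_j in zip(weights, biases)]


def softmax(z):
    """Convert logits into probabilities that sum to 1."""
    shift = max(z)  # stability: exp() never sees a large positive argument
    exps = [math.exp(value - shift) for value in z]
    total = sum(exps)
    return [e / total for e in exps]


# Illustrative values only: 2 inputs, 3 output classes.
x = [1.0, 2.0]
weights = [[1.0, 0.0],
           [0.0, 1.0],
           [1.0, 1.0]]
biases = [0.0, 0.0, -1.0]

logits = dense(x, weights, biases)  # [1.0, 2.0, 2.0]
probs = softmax(logits)
predicted_class = probs.index(max(probs))
```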

   The function of these layers is to detect features such as edges and complex patterns. Different
kinds of filters are used, as required, to extract features and different kinds of edges. For potato
disease detection, we used three well-known transfer learning models: VGG16, InceptionV3, and
ResNet-50.
VGG16 [23] is a deep CNN architecture introduced for image classification in 2014. It takes RGB images
of size 224x224 pixels as input and consists of 16 weight layers: 13 convolutional and 3 fully
connected. The ReLU activation function is employed in all hidden layers. VGG16 is trained on the
ImageNet dataset and can classify images into 1000 categories and detect objects from 200 classes. The
convolution layers use 3x3 filters, with the number of filters increasing with depth to detect complex
hierarchical patterns in the images. Max pooling layers of size 2x2 with a stride of 2 downsample the
feature maps by selecting the maximum value within each small region. After the feature extraction
layers, there are 2 fully connected layers of 4096 neurons each, followed by a final fully connected
layer of 1000 neurons for classification.
InceptionV3 [24] is a 48-layer deep pretrained CNN that was trained on the ImageNet dataset and can
categorize images into 1000 different categories. The network takes RGB images of size 299x299 pixels
as input. Its architecture consists of Inception modules, each combining 1x1, 3x3, and 5x5
convolutions. InceptionV3 has fewer parameters because it factorizes convolutions: a 5x5 filter can be
replaced by two stacked 3x3 filters. A 5x5 filter requires 25 weights, whereas two 3x3 filters require
18, a reduction of 28% without losing the ability of a 5x5 filter to capture patterns. Because of this
lightweight architecture, it is computationally efficient.
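The 25-versus-18 arithmetic can be checked in a few lines. This sketch counts weights for a single input-output channel pair and ignores biases, which is the simplification the paragraph itself uses; the helper names are illustrative.

```python
def conv_weights(kernel_h, kernel_w, in_channels=1, out_channels=1):
    """Number of weights in a convolution filter bank (biases ignored)."""
    return kernel_h * kernel_w * in_channels * out_channels


def receptive_field(kernel_sizes):
    """Receptive field of stacked stride-1 convolutions."""
    return 1 + sum(k - 1 for k in kernel_sizes)


single_5x5 = conv_weights(5, 5)              # 25 weights
stacked_3x3 = 2 * conv_weights(3, 3)         # 18 weights
reduction = 1 - stacked_3x3 / single_5x5     # about 0.28, i.e. 28% fewer weights

# Both options see the same 5x5 neighbourhood of the input.
same_view = receptive_field([3, 3]) == receptive_field([5])
```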
ResNet-50 [25] is a popular deep CNN architecture from the ResNet (Residual Networks) family, which was
developed to tackle the vanishing gradient problem in deep networks by introducing residual blocks. It
was first introduced in 2016. This 50-layer deep model is made up of convolutional layers, batch
normalization, ReLU activations, and skip (residual) connections. The layers comprise 48 convolutional
layers, one max-pooling layer, and one average-pooling layer. Each of the four main stages of the model
contains several residual blocks. Each residual block has a shortcut connection that skips one or more
layers, enabling the gradient to pass back through the network without vanishing, and consists of three
convolutional layers with 1x1, 3x3, and 1x1 convolutions. The final fully connected layer in ResNet-50
typically has 1,000 output units for the 1,000 classes of the ImageNet dataset, on which the model was
originally trained. However, ResNet-50 can be adapted to any number of classes by adjusting the number
of output units in the final layer. This modification is common in transfer learning, where the network
is adapted to datasets with fewer or more classes.
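Why a shortcut connection keeps the gradient from vanishing can be seen from a short derivation. The notation here is introduced for illustration and is not from the source: x is the block input, F is the function computed by the block's convolutional layers, y is the block output, and L is the loss.

```latex
\[
  y = x + F(x),
  \qquad
  \frac{\partial \mathcal{L}}{\partial x}
    = \frac{\partial \mathcal{L}}{\partial y}
      \left( I + \frac{\partial F}{\partial x} \right)
\]
```

The identity term I carries the upstream gradient back unchanged, so the signal reaching earlier layers does not shrink to zero even when the derivative of F is very small.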
For our experiment on potato disease detection, we propose a custom CNN comprising 20 layers (Table 3).
The architecture of the proposed CNN model is described in the following steps. Figure 1 and Figure 2
show the model architecture and the detailed layer-wise flow diagram respectively.
Figure 1: Paradigm of proposed model


    • The images of the PlantVillage and PLD datasets are enhanced using CLAHE for better clarity,
      noise reduction and ease of feature extraction.
    • Enhanced images undergo augmentation using a variety of techniques that enlarge the
      dataset and improve the generalization capacity of the model.
    • Images are passed to the input layer, which accepts inputs of size 224 x 224 with 3 channels.
    • Images are then batch normalized: the mean and variance are calculated across the mini-batch
      during training and used to normalize the values, for better convergence and improved
      training accuracy.
    • Normalized images are then fed to our customized feature extractor comprising 5 blocks, each
      consisting of convolution and max-pooling layers. The convolution layers capture
      minute details for fine feature extraction. The max-pooling layers reduce the
      spatial dimensions by selecting the maximum value within each kernel window, retaining the most
      prominent features. A stride of 1 is used in the convolution layers, and the ReLU activation
      function is applied after each convolution to introduce non-linearity. Only positive
      feature values are activated, allowing the model to learn intricate patterns.
    • A uniform kernel size of 3x3 is used for the convolution layers, and all the max-pooling
      layers are of size 2x2. Block 1 consists of 1 convolution layer of 32 filters, followed by a
      max-pooling layer. Block 2 consists of 2 convolution layers of 64 filters each and a max-pooling
      layer. Block 3 contains 2 convolution layers of 128 filters each and a max-pooling layer. Block 4
      contains 2 convolution layers of 256 filters each and a max-pooling layer. Block 5, the last
      block, contains 1 convolution layer of 512 filters and a max-pooling layer (Table 3).
    • The output of the feature extractor is fed into the customized classifier, which first flattens
      the feature maps into a single vector of 8192 values, followed by a dropout layer to avoid
      overfitting. After this, dense layers step the size down from 1024
      to 256 to 64 neurons in turn. These dense layers also employ the ReLU activation function.
    • The output of the last dense layer, which consists of three neurons representing the three classes,
      is passed through the softmax activation function to classify the image as Healthy, Early
      Blight, or Late Blight.
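The layer sizes in the steps above can be cross-checked with a small shape and parameter calculator. This is a sketch under stated assumptions rather than the authors' code: convolutions use 3x3 kernels with stride 1 and no padding, pooling is 2x2 with stride 2, batch normalization is counted with four values per channel (as Keras reports it), and Block 5 is taken with a single 512-filter convolution, following the layer listing in Table 3, which is what yields the 8192-value flatten output.

```python
def conv(shape, filters, k=3):
    """Output shape and parameter count of a k x k, stride-1, unpadded convolution."""
    h, w, c = shape
    return (h - k + 1, w - k + 1, filters), (k * k * c + 1) * filters


def pool(shape, size=2):
    """Output shape of size x size max-pooling with stride equal to size."""
    h, w, c = shape
    return (h // size, w // size, c), 0


def summarize(input_shape=(224, 224, 3),
              blocks=((32, 1), (64, 2), (128, 2), (256, 2), (512, 1)),
              dense_units=(1024, 256, 64, 3)):
    """Return (layer name, output shape, parameter count) for every layer."""
    rows = [("BatchNormalization", input_shape, 4 * input_shape[2])]
    shape = input_shape
    for filters, n_convs in blocks:
        for _ in range(n_convs):
            shape, params = conv(shape, filters)
            rows.append(("Conv2D", shape, params))
        shape, params = pool(shape)
        rows.append(("MaxPooling2D", shape, params))
    flat = shape[0] * shape[1] * shape[2]
    rows.append(("Flatten", (flat,), 0))
    rows.append(("Dropout", (flat,), 0))
    units = flat
    for n in dense_units:
        rows.append(("Dense", (n,), (units + 1) * n))
        units = n
    return rows


rows = summarize()
total_params = sum(params for _, _, params in rows)
for name, shape, params in rows:
    print(f"{name:<20}{str(shape):<18}{params:>12,}")
print(f"Total parameters: {total_params:,}")
```

Under these assumptions the network has 20 layers and roughly 11.0 million parameters, most of them in the first dense layer.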


Table 2
Augmentation parameters
                                  Sl No.   Operation used     Range
                                    1.       Rotation       30 Degree
                                    2.         Zoom            0.15
                                    3.      Width shift         0.2
                                    4.      Height shift        0.2
                                    5.         Shear           0.15
                                    6.     Horizontal flip     True

   Softmax is a generalization of the sigmoid function to multiclass classification; it generates
probabilistic outputs between 0 and 1 that sum to 1. ReLU activation adds non-linearity to the model
after each convolution and dense layer by setting negative values to zero while leaving positive ones
unaltered. Details about the hyperparameters are given in Table 9.


Figure 2: Flow diagram of proposed model


3. Experimental Result
All the experiments were run on an Intel(R) Core(TM) i5-1135G7 CPU, an NVIDIA Tesla T4 GPU
with 16 GB VRAM, 32 GB of RAM, and a Windows operating system. The deep learning implementation
was carried out using the TensorFlow 2.16.1 framework with Python 3.12.2 to accelerate neural network
operations. The proposed methodology was implemented on the two public datasets, PLD and
PlantVillage. Keras, a neural network framework written in Python, was used to implement
the model. In total, 2152 images from the PlantVillage dataset and 3251 images from the PLD dataset
were used. To artificially increase the dataset and flexibility of the model, data augmentation
Table 3
An overview of the suggested neural network model
                           Layer                    Output Shape           Param #
                           Batch_normalization      (None, 224, 224, 3)    12
                           Convolution2D_1          (None, 222, 222, 32)   896
                           MaxPooling2D_1           (None, 111, 111, 32)   0
                           Convolution2D_2          (None, 109, 109, 64)   18,496
                           Convolution2D_3          (None, 107, 107, 64)   36,928
                           MaxPooling2D_2           (None, 53, 53, 64)     0
                           Convolution2D_4          (None, 51, 51, 128)    73,856
                           Convolution2D_5          (None, 49, 49, 128)    147,584
                           MaxPooling2D_3           (None, 24, 24, 128)    0
                           Convolution2D_6          (None, 22, 22, 256)    295,168
                           Convolution2D_7          (None, 20, 20, 256)    590,080
                           MaxPooling2D_4           (None, 10, 10, 256)    0
                           Convolution2D_8          (None, 8, 8, 512)      1,180,160
                           MaxPooling2D_5           (None, 4, 4, 512)      0
                           Flatten                  (None, 8192)           0
                           Dropout                  (None, 8192)           0
                           Dense_1                  (None, 1024)           8,389,632
                           Dense_2                  (None, 256)            262,400
                           Dense_3                  (None, 64)             16,448
                           Dense_4                  (None, 3)              195


technique has been employed on the dataset [26]. For both datasets, we used the same techniques.
We used the ImageDataGenerator class in Keras for augmentation (Table 2). We applied parameters to
rotate the images randomly by up to 30 degrees, shift them horizontally and vertically, and zoom in or
out within a range of 15%. Training used a batch size of 32, a learning rate of 0.001, and 200 epochs
(Table 9). Categorical cross-entropy is the loss function, and the Adam optimizer is employed.
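The augmentation and training settings can be collected in one place. The sketch below is a plain configuration fragment: the values come from Table 2 and the paragraph above, while the key names are an assumption chosen to match the keyword arguments of Keras' `ImageDataGenerator` and common Keras training options. The shear value is reproduced as listed; Keras interprets `shear_range` in degrees.

```python
# Augmentation settings from Table 2, keyed by ImageDataGenerator argument names.
AUGMENTATION = {
    "rotation_range": 30,        # degrees
    "zoom_range": 0.15,
    "width_shift_range": 0.2,    # fraction of image width
    "height_shift_range": 0.2,   # fraction of image height
    "shear_range": 0.15,
    "horizontal_flip": True,
}

# Training settings stated in the text.
TRAINING = {
    "learning_rate": 0.001,
    "epochs": 200,
    "batch_size": 32,
    "loss": "categorical_crossentropy",
    "optimizer": "adam",
}
```

With TensorFlow installed, the first dictionary could be passed as `ImageDataGenerator(**AUGMENTATION)`; how the authors wired these values into their pipeline is not shown in the text.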
For plant disease detection, we used the publicly available PlantVillage dataset [27] and the PLD
dataset. The PlantVillage dataset is a widely available and comprehensive benchmark dataset for crop
leaf disease classification. It includes 54,306 samples across 14 plant species, covering a total of 38
classes. Among them, 26 classes are from diseased plants, while the remaining 12 classes belong to
healthy plants. Because our study focuses on potato leaf disease prediction, we selected the three
kinds of potato leaf samples (late blight, early blight, and healthy) from the PlantVillage dataset.
In all, 2152 images of potato leaves were used in our experiment; 1000 of these images showed early
blight, 1000 showed late blight, and the remaining 152 showed healthy leaves (Table 4). Potato crops
are susceptible to a fungal ailment called early blight. The PlantVillage dataset does not have an
adequate number of images and exhibits an uneven class distribution, so the PLD dataset, which was
created in Pakistan's Central Punjab region, was also used. From that dataset, we rejected some images
of potato leaves due to redundancy. A total of 3251 images of potato leaves were used, of which 1303
are from the early blight class, 816 from the healthy class, and 1132 from the late blight class
(Table 5).
The model's performance is measured using standard validation metrics. To evaluate how well the
proposed model discriminates between classes, its accuracy, recall, precision, and F1-score were
calculated. The confusion matrix is a tabular method of displaying the prediction model's performance.
Each entry in a confusion matrix gives the number of predictions in which the model correctly or
incorrectly identified a class.
True Positives (TP) are the predictions in which the classifier correctly identifies the positive
class as positive. Conversely, True Negatives (TN) are the predictions in which the classifier
correctly identifies the negative class as negative. False Positives (FP) are the predictions in which
the classifier incorrectly predicts the negative class as positive. False Negatives (FN) are the
predictions in which the classifier incorrectly predicts the positive class as negative. Accuracy
gives the model's overall correctness, that is, the percentage of all samples that the classifier
classified correctly. Equation (1) can be used to determine accuracy.

\[ \mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{1} \]
Precision is the proportion of positive-class predictions that are actually positive. It can be
computed using Equation (2).
\[
\text{Precision} = \frac{TP}{TP + FP} \tag{2}
\]
Recall is the proportion of positive samples that the classifier correctly predicted to be positive.
It is also known as probability of detection, sensitivity, or true positive rate (TPR). It can be
computed using Equation (3).
\[
\text{Recall} = \frac{TP}{TP + FN} \tag{3}
\]
The F1-score merges precision and recall into a single metric and is frequently used by machine
learning practitioners to balance the two. Mathematically, it is the harmonic mean of precision and
recall. It can be computed using Equation (4).
\[
\text{F1-Score} = 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} \tag{4}
\]
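Equations (1) to (4) are defined for a binary problem. For a three-class problem such as this one, a common approach (assumed here, not stated explicitly in the text) is to treat each class in turn as "positive" and the other two as "negative". The sketch below applies the four equations to one class of a hypothetical confusion matrix whose rows are true classes and whose columns are predicted classes; the numbers are illustrative only.

```python
def one_vs_rest_counts(cm, k):
    """Return (TP, FP, FN, TN) for class k, treating all other classes as negative."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    tp = cm[k][k]
    fp = sum(cm[i][k] for i in range(n)) - tp  # predicted k, but true class differs
    fn = sum(cm[k]) - tp                       # true class k, but predicted otherwise
    tn = total - tp - fp - fn
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    return (tp + tn) / (tp + tn + fp + fn)     # Equation (1)


def precision(tp, fp):
    return tp / (tp + fp) if tp + fp else 0.0  # Equation (2)


def recall(tp, fn):
    return tp / (tp + fn) if tp + fn else 0.0  # Equation (3)


def f1_score(p, r):
    return 2 * p * r / (p + r) if p + r else 0.0  # Equation (4)


# Hypothetical 3-class confusion matrix, for illustration only.
example_cm = [
    [48, 2, 0],
    [3, 45, 2],
    [0, 1, 49],
]

tp, fp, fn, tn = one_vs_rest_counts(example_cm, 0)
p, r = precision(tp, fp), recall(tp, fn)
print(f"TP={tp} FP={fp} FN={fn} TN={tn}")
print(f"accuracy={accuracy(tp, fp, fn, tn):.4f} precision={p:.4f} "
      f"recall={r:.4f} f1={f1_score(p, r):.4f}")
```

The guards against a zero denominator return 0.0 by convention; other conventions (for example, reporting the value as undefined) are equally possible.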


Table 4
PlantVillage Dataset
                                      Sl No.        Class       Sample size
                                        1.         Healthy         152
                                        2.       Late Blight       1000
                                        3.       Early Blight      1000


Table 5
PLD Dataset
                                      Sl No.        Class       Sample size
                                        1.         Healthy         816
                                        2.       Late Blight       1132
                                        3.       Early Blight      1303


Table 6
Comparative study of the number of parameters of the proposed model and various transfer learning models
                       Model              Trainable parameters   Non-trainable parameters   Total parameters
                       VGG16                            75,267                 14,714,688         14,789,955
                       InceptionV3                     153,603                 21,802,784         21,956,387
                       ResNet50                        301,059                 23,587,712         23,888,771
                       Proposed Model               11,011,849                          6         11,011,855
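The counts in Table 6 can be sanity-checked with simple arithmetic. The sketch below verifies that each total is the sum of the trainable and non-trainable counts. It also tests one possible explanation for the small trainable counts of the transfer learning models: a single three-unit dense layer applied to the flattened output of a frozen backbone. That explanation, and the feature-map shapes used (which correspond to a 224x224 input), are assumptions inferred from the numbers and are not stated in the paper.

```python
# Parameter counts copied from Table 6: (trainable, non-trainable, total).
TABLE_6 = {
    "VGG16": (75_267, 14_714_688, 14_789_955),
    "InceptionV3": (153_603, 21_802_784, 21_956_387),
    "ResNet50": (301_059, 23_587_712, 23_888_771),
    "Proposed Model": (11_011_849, 6, 11_011_855),
}

# ASSUMPTION (not from the paper): final feature-map shape (height, width, channels)
# of each frozen backbone when fed a 224x224 image.
ASSUMED_FEATURE_MAPS = {
    "VGG16": (7, 7, 512),
    "InceptionV3": (5, 5, 2048),
    "ResNet50": (7, 7, 2048),
}
NUM_CLASSES = 3  # early blight, late blight, healthy


def dense_params(num_inputs, num_units):
    """Weights plus biases of a fully connected layer."""
    return num_inputs * num_units + num_units


def head_params(feature_map, num_classes=NUM_CLASSES):
    """Parameters of one dense layer applied to the flattened feature map."""
    height, width, channels = feature_map
    return dense_params(height * width * channels, num_classes)


for model, (trainable, frozen, total) in TABLE_6.items():
    # The Total column should be the sum of the other two columns.
    consistent = trainable + frozen == total
    line = f"{model}: total consistent={consistent}"
    if model in ASSUMED_FEATURE_MAPS:
        estimate = head_params(ASSUMED_FEATURE_MAPS[model])
        line += f", assumed-head estimate={estimate:,} vs reported trainable={trainable:,}"
    print(line)
```

Under these assumptions the estimates coincide with the reported trainable counts, which would mean that only a small classification head was trained on each pretrained backbone, while nearly all weights of the proposed model are trained from scratch.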
Table 7
Comparative study of performance measures of the proposed model and well-known pretrained models
          Model                           Averaging            Precision      Recall     F1-score
          VGG16                           Micro Average              0.97      0.97        0.97
                                          Macro Average              0.98      0.90        0.93
                                          Weighted Average           0.97      0.97        0.97
          ResNet50                        Micro Average              0.98      0.98        0.98
                                          Macro Average              0.98      0.97        0.97
                                          Weighted Average           0.97      0.97        0.98
          InceptionV3                     Micro Average              0.77      0.77        0.77
                                          Macro Average              0.80      0.60        0.61
                                          Weighted Average           0.79      0.77        0.75
          Proposed Model (PlantVillage)   Micro Average              0.99      0.99        0.99
                                          Macro Average              0.99      0.99        0.99
                                          Weighted Average           0.99      0.99        0.99
          Proposed Model (PLD dataset)    Micro Average              0.99      0.99        0.99
                                          Macro Average              0.99      0.99        0.99
                                          Weighted Average           0.99      0.99        0.99
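Table 7 reports micro, macro and weighted averages without defining them. The sketch below uses the usual definitions, which are assumed to be the ones behind the table: micro averaging pools the counts of all classes, macro averaging takes the unweighted mean of the per-class scores, and weighted averaging weights each class by its number of true samples. The confusion matrix is hypothetical (rows are true classes, columns are predicted classes) and has a deliberately small third class.

```python
METRICS = ("precision", "recall", "f1")


def per_class_scores(cm):
    """Precision, recall, F1 and support for every class of confusion matrix cm."""
    n = len(cm)
    scores = []
    for k in range(n):
        tp = cm[k][k]
        predicted_k = sum(cm[i][k] for i in range(n))  # TP + FP
        support = sum(cm[k])                           # TP + FN
        p = tp / predicted_k if predicted_k else 0.0
        r = tp / support if support else 0.0
        f1 = 2 * p * r / (p + r) if p + r else 0.0
        scores.append({"precision": p, "recall": r, "f1": f1, "support": support})
    return scores


def averages(cm):
    """Micro, macro and weighted averages of precision, recall and F1."""
    scores = per_class_scores(cm)
    total = sum(s["support"] for s in scores)
    correct = sum(cm[k][k] for k in range(len(cm)))
    # For single-label multi-class data, pooled precision, recall and F1 all equal accuracy.
    micro = correct / total
    return {
        "micro": {m: micro for m in METRICS},
        "macro": {m: sum(s[m] for s in scores) / len(scores) for m in METRICS},
        "weighted": {m: sum(s[m] * s["support"] for s in scores) / total for m in METRICS},
    }


# Hypothetical imbalanced confusion matrix, for illustration only.
imbalanced_cm = [
    [95, 5, 0],
    [4, 96, 0],
    [3, 3, 9],
]

result = averages(imbalanced_cm)
for kind, values in result.items():
    print(kind, {m: round(v, 3) for m, v in values.items()})
```

In this example the poorly recalled minority class pulls the macro recall well below the micro recall. A gap of that kind in Table 7 (for instance, the VGG16 and InceptionV3 rows) may therefore point to weaker performance on a small class, although the table alone does not confirm this.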


Table 8
Comparative study of accuracy with state-of-the-art models
 Reference                                         Technique                Crop       No. of Diseases   Accuracy
 Tiwari [28]                               SVM, KNN and Neural Net          Potato           2            97.8%
 Zhang [29]                                      Faster RCNN                Tomato           4            97.1%
 Barman [30]                                       SBCNN                    Potato           2            96.75%
 Rozaqi [31]                                         CNN                    Potato           2             92%
 Proposed Model (PlantVillage dataset)               CNN                    Potato           3            99.3%
 Proposed Model (PLD dataset)                        CNN                    Potato           3            99.23%


Table 9
Hyperparameters
                                   Hyper-parameter          Value
                                   Convolution Layers          8
                                   Max-pooling Layers          5
                                   Dropout                    0.5
                                   Activation function       ReLU
                                   Number of epochs           200
                                   Batch size                 32
                                   Learning rate             0.001


  A binary classifier’s diagnostic performance can be assessed visually using a ROC curve. Across a
range of threshold values, it compares True Positive Rate (TPR) against the False Positive Rate (FPR).
The FPR shows the percentage of actual negatives that are mistakenly categorized as positives, whereas
the TPR, or sensitivity, shows the percentage of actual positives that are accurately detected.

   The trade-off between sensitivity and specificity is illustrated by the ROC curve. Better performance
is indicated by a model whose curve is closer to the plot’s upper-left corner. This performance is
frequently measured using the Area Under the ROC Curve (AUC), where a value of 0.5 indicates no
discriminatory capacity and a value of 1 indicates flawless classification.
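The construction described above can be sketched in a few lines: sort the samples by predicted score, sweep the threshold from high to low, record one (FPR, TPR) point per distinct score, and integrate with the trapezoidal rule. The labels and scores below are hypothetical. For the three-class problem in this paper, such a curve would typically be drawn per class in a one-vs-rest fashion, which is an assumption here.

```python
def roc_points(y_true, scores):
    """Return the (FPR, TPR) points of the ROC curve, from (0, 0) to (1, 1).

    y_true holds 1 for positive and 0 for negative samples; both classes must be present.
    """
    positives = sum(y_true)
    negatives = len(y_true) - positives
    order = sorted(range(len(scores)), key=lambda i: -scores[i])  # highest score first
    points = [(0.0, 0.0)]
    tp = fp = 0
    for rank, i in enumerate(order):
        if y_true[i] == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last sample that shares this score (tie handling).
        is_last_of_tie = rank + 1 == len(order) or scores[order[rank + 1]] != scores[i]
        if is_last_of_tie:
            points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical labels and predicted scores, for illustration only.
labels = [1, 1, 0, 1, 0, 0]
predicted = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

curve = roc_points(labels, predicted)
print(curve)
print(f"AUC = {auc(curve):.3f}")
```

If every sample receives the same score, the curve collapses to the diagonal and the AUC is 0.5, matching the "no discriminatory capacity" case described in the text.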
Figure 3: Confusion matrix generated by proposed methodology


Figure 4: Training validation accuracy of proposed methodology


Figure 5: Feature extraction output of different convolution layers


Figure 6: Training validation loss of proposed methodology


Figure 7: Receiver Operating Characteristic curve of proposed methodology


  The confusion matrices for the PlantVillage and PLD datasets are given in Figure 3. The accuracy and
loss curves for both datasets (Figures 4 and 6) give an idea of the convergence speed of the model. The
proposed model outperformed the three transfer learning models VGG16, ResNet50 and InceptionV3 in terms
of Precision, Recall and F1-score (Table 7), and achieved a test accuracy of 99.3% on the PlantVillage
dataset and 99.23% on the PLD dataset (Table 8). Because it has fewer total parameters than the deep
CNN architectures (Table 6), it requires far less hardware to implement. The model also outperformed
the other state-of-the-art models in terms of accuracy (Table 8). The ROC curves for the PlantVillage
and PLD datasets are given in Figure 7. Thus, the model offers an efficient way to detect potato leaf
diseases, with better accuracy than the pretrained models and lower computational cost.


4. Conclusion and Future Scope
Agriculture is the most important sector of our country’s economy, as the majority of the population
relies heavily on it. Identifying diseases that damage economically valuable crops early on is crucial
to protecting farmers from the associated financial losses. Potato is one of the most important staple
crops, and our experiment addresses the detection of potato leaf diseases [26]. For that purpose, our
custom Convolutional Neural Network classifies potato leaves into three groups: early blight, late
blight, and healthy. Because it has fewer layers and parameters than the transfer learning and other
CNN models considered, it is efficient and well suited to computationally constrained environments. It
achieved more than 99% precision on both the PlantVillage and PLD datasets.
The aim of our future research is to improve the model’s adaptability and robustness so that it can
identify diseases in a variety of crops besides potatoes. We also plan to develop a mobile and web
application for the benefit of the farming community, and to reduce the number of parameters further to
make the model more efficient.


Acknowledgments
Thanks to the developers of ACM consolidated LaTeX styles https://github.com/borisveytsman/acmart
and to the developers of Elsevier updated LaTeX templates
https://www.ctan.org/tex-archive/macros/latex/contrib/els-cas-templates.


Declaration on Generative AI
The author(s) have not employed any Generative AI tools.


References
 [1] F. Arshad, M. Mateen, S. Hayat, M. Wardah, Z. Al-Huda, Y. Gu, M. A. Al-antari Aisslab, PLDPNet:
     End-to-end hybrid deep learning framework for potato leaf disease prediction, Alexandria Engineering
     Journal 78 (2023). doi:10.1016/j.aej.2023.07.076.
 [2] N. Przulj, A. Velimirovic, D. Petrović, P. Ilić, M. Mirosavljević, V. Trkulja, Z. Jovovic, From the
     Stone Hoe to Circular Agriculture, 2024, pp. 119–168.
 [3] P. Dalwadi, An analysis of India’s agricultural sector: Challenges and opportunities, EPRA
     International Journal of Multidisciplinary Research (IJMR) (2023) 293–296. doi:10.36713/epra13069.
 [4] Z. Iqbal, M. Khan, M. Sharif, J. Shah, An automated detection and classification of citrus plant
     diseases using image processing techniques: A review, Computers and Electronics in Agriculture
     153 (2018) 12–32. doi:10.1016/j.compag.2018.07.032.
 [5] D.-J. Rashid, I. Khan, A. Ghulam, S. Almotiri, M. Al Ghamdi, K. Masood, Multi-level deep
     learning model for potato leaf disease recognition, Electronics 10 (2021).
     doi:10.3390/electronics10172064.
 [6] S. Manzoor, S. Manzoor, S. Islam, J. Boudjadar, AgriScanNet-18: A Robust Multilayer CNN for
     Identification of Potato Plant Diseases, 2024, pp. 291–308. doi:10.1007/978-3-031-47724-9_20.
 [7] Y. Sasaki, T. Okamoto, K. Imou, T. Torii, Automatic diagnosis of plant disease, Journal of
     the Japanese Society of Agricultural Machinery 61 (2010) 119–126.
     doi:10.11357/jsam1937.61.2_119.
 [8] T. Hussain, B. P. Singh, F. Anwar, A quantitative real time PCR based method for the detection
     of Phytophthora infestans causing late blight of potato, in infested soil, Saudi Journal of
     Biological Sciences 21 (2014) 380–386.
     URL: https://www.sciencedirect.com/science/article/pii/S1319562X13000892.
     doi:10.1016/j.sjbs.2013.09.012.
 [9] J. Johnson, G. Sharma, S. Srinivasan, S. Masakapalli, S. Sharma, J. Sharma, V. Dua, Enhanced
     field-based detection of potato blight in complex backgrounds using deep learning, Plant Phenomics
     2021 (2021) 1–13. doi:10.34133/2021/9835724.
[10] P. Moallem, N. Razmjooy, A multi layer perceptron neural network trained by invasive weed
     optimization for potato color image segmentation, Trends in Applied Sciences Research 6 (2012)
     445–455. doi:10.3923/tasr.2012.445.455.
[11] N. E. Khalifa, M. Taha, L. Abou El-Magd, A. E. Hassanien, Artificial Intelligence in Potato
     Leaf Disease Classification: A Deep Learning Approach, 2021, pp. 63–79.
     doi:10.1007/978-3-030-59338-4_4.
[12] S. Radha, J. Chatterjee, N. Jhanjhi, S. Brohi, Performance of deep learning vs machine learning in
     plant leaf disease detection, Microprocessors and Microsystems 80 (2021) 103615.
     doi:10.1016/j.micpro.2020.103615.
[13] K. KC, Z. Yin, M. Wu, Z. Wu, Depthwise separable convolution architectures for plant disease
     classification, Computers and Electronics in Agriculture 165 (2019).
     doi:10.1016/j.compag.2019.104948.
[14] R. Karthik, M. Hariharan, S. Anand, P. Mathikshara, A. Johnson, R. Menaka, Attention embedded
     residual cnn for disease detection in tomato leaves, Applied Soft Computing 86 (2020) 105933.
[15] A. Khamparia, G. Saini, D. Gupta, A. Khanna, S. Tiwari, V. Albuquerque, Seasonal crops disease
     prediction and classification using deep convolutional encoder network, Circuits, Systems, and
     Signal Processing 39 (2020). doi:10.1007/s00034-019-01041-0.
[16] M. Islam, A. Dinh, K. Wahid, P. Bhowmik, Detection of potato diseases using image segmentation
     and multiclass support vector machine, 2017, pp. 1–4. doi:10.1109/CCECE.2017.7946594.
[17] A predictive machine learning application in agriculture: Cassava disease detection and
     classification with imbalanced dataset using convolutional neural networks, Egyptian Informatics
     Journal 22 (2021) 27–34. doi:10.1016/j.eij.2020.02.007.
[18] N. Deepa, N. Nagarajan, Kuan noise filter with hough transformation based reweighted linear
     program boost classification for plant leaf disease detection, Journal of Ambient Intelligence and
     Humanized Computing 12 (2021). doi:10.1007/s12652-020-02149-x.
[19] G. G., A. P. J., Identification of plant leaf diseases using a nine-layer deep convolutional
     neural network, Computers & Electrical Engineering 76 (2019) 323–338.
     URL: https://www.sciencedirect.com/science/article/pii/S0045790619300023.
     doi:10.1016/j.compeleceng.2019.04.011.
[20] Q. Liang, S. Xiang, Y. Hu, G. Coppola, D. Zhang, W. Sun, Pd2se-net: Computer-assisted plant
     disease diagnosis and severity estimation network, Computers and Electronics in Agriculture 157
     (2019) 518–529. doi:10.1016/j.compag.2019.01.034.
[21] D. Hughes, M. Salathe, An open access repository of images on plant health to enable the
     development of mobile disease diagnostics through machine learning and crowdsourcing (2015).
[22] P. Musa, F. Rafi, M. Lamsani, A review: Contrast-limited adaptive histogram equalization (CLAHE)
     methods to help the application of face recognition, 2018, pp. 1–6.
     doi:10.1109/IAC.2018.8780492.
[23] K. Simonyan, A. Zisserman, Very deep convolutional networks for large-scale image recognition,
     arXiv 1409.1556 (2014).
[24] C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, Z. Wojna, Rethinking the inception architecture for
     computer vision, in: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR),
     2016, pp. 2818–2826. doi:10.1109/CVPR.2016.308.
[25] K. He, X. Zhang, S. Ren, J. Sun, Deep residual learning for image recognition, 2016, pp. 770–778.
     doi:10.1109/CVPR.2016.90.
[26] V. Kukreja, A. Baliyan, V. Salonki, R. Kaushal, Potato blight: Deep learning model for binary and
     multi-classification (2021) 967–672. doi:10.1109/SPIN52536.2021.9566079.
[27] D. Hughes, M. Salathe, An open access repository of images on plant health to enable the
     development of mobile disease diagnostics through machine learning and crowdsourcing (2015).
[28] D. Tiwari, M. Ashish, N. Gangwar, A. Sharma, S. Patel, S. Bhardwaj, Potato leaf diseases detection
     using deep learning, 2020, pp. 461–466. doi:10.1109/ICICCS48265.2020.9121067.
[29] Y. Zhang, C. Song, D. Zhang, Deep learning-based object detection improvement for tomato
     disease, IEEE Access PP (2020) 1–1. doi:10.1109/ACCESS.2020.2982456.
[30] U. Barman, D. Sahu, G. G. Barman, J. Das, Comparative assessment of deep learning to detect
     the leaf diseases of potato based on data augmentation (2020).
     doi:10.1109/ComPE49325.2020.9200015.
[31] A. J. Rozaqi, A. Sunyoto, Identification of disease in potato leaves using convolutional neural
     network (CNN) algorithm, in: 2020 3rd International Conference on Information and Communications
     Technology (ICOIACT), 2020, pp. 72–76. doi:10.1109/ICOIACT50329.2020.9332037.