=Paper=
{{Paper
|id=Vol-2267/200-206-paper-37
|storemode=property
|title=Architecture and basic principles of the multifunctional platform for plant disease detection
|pdfUrl=https://ceur-ws.org/Vol-2267/200-206-paper-37.pdf
|volume=Vol-2267
|authors=Pavel Goncharov,Gennady Ososkov,Andrey Nechaevskiy,Alexander Uzhinskiy
}}
==Architecture and basic principles of the multifunctional platform for plant disease detection==
Proceedings of the VIII International Conference "Distributed Computing and Grid-technologies in Science and Education" (GRID 2018), Dubna, Moscow region, Russia, September 10 - 14, 2018

ARCHITECTURE AND BASIC PRINCIPLES OF THE MULTIFUNCTIONAL PLATFORM FOR PLANT DISEASE DETECTION

Pavel Goncharov 2, Gennady Ososkov 1, Andrey Nechaevskiy 1, Alexander Uzhinskiy 1

1 Joint Institute for Nuclear Research
2 Sukhoi State Technical University of Gomel, Belarus

E-mail: auzhinskiy@jinr.ru

The increasing number of smartphones and the advances in the field of deep learning open new opportunities in crop disease detection. The aim of our research is to develop a multifunctional platform that uses modern organizational and deep learning technologies to provide a new level of service to the farming community. As the end product, we are going to develop a mobile application that allows users to send photos and text descriptions of sick plants and to receive the cause of the illness and its treatment. We have collected a special database of grape leaves consisting of five sets of images. We have reached 99% accuracy in disease detection with a deep siamese convolutional network. We have developed a web portal with basic detection functionality and provided an opportunity to download our self-collected image database. The concepts and architecture of the plant disease detection platform are presented, together with a description of the deep learning models and techniques used.

Keywords: machine learning, statistical models, siamese networks, plant disease detection, transfer learning

© 2018 Pavel Goncharov, Gennady Ososkov, Andrey Nechaevskiy, Alexander Uzhinskiy

1. Introduction

The quality of the available data about the impact of plant diseases is variable, patchy and often missing, particularly for smallholders, who produce the majority of the world's food. Crop losses caused by diseases are estimated at between 20% and 40% [1]. It is clear that plant diseases are a serious threat to the well-being of rural families, to economies and governments, and to food security worldwide. The aim of our research is to facilitate the detection and prevention of diseases of agricultural plants by means of both deep learning and programming services. We have created the plant disease detection platform (PDDP) from scratch and share our experience here. Below we describe the deep learning models and techniques used, and we give a link to our open image database.

There are many studies in which deep learning was used to identify plant diseases. Some of them report excellent detection rates, above 90%. However, they typically lack a real application or an open database that would make it possible to reproduce the experiments. Probably the most famous mobile application for plant disease detection is Plantix [2], developed by PEAT, a German AgTech startup. Currently, Plantix can detect more than 300 diseases. The Plantix image database is closed, and we could not find any information about the technologies it uses for disease detection. We carried out a small experiment, processing different types of images from our self-collected database with the help of Plantix. It allowed us to conclude that Plantix identifies the plant type rather well: 64 of 70 images were recognized as grapes. At the same time, its disease detection ability is rather limited.
Only a few images were diagnosed correctly: less than 10% of the images had the right disease at the top of the suggestion list. Perhaps our dataset does not meet some requirements of the Plantix application. We used both original images from the Internet and preprocessed ones in which the problems were obvious, but the results were quite similar. Our goal is to reach the same functionality as Plantix but with better disease detection accuracy, so we had to find an image database and a good statistical model.

2. First experiments

We considered the different models used in related works to understand which is the best option. In [3] the authors reached a detection accuracy of up to 99.7% on a held-out test set. However, their results on real-life images were quite unsatisfactory, only about 30%. The authors used PlantVillage [4], a well-known public database of 86,147 images of diseased and healthy plant leaves. It was the only public database available, and we believed that we could improve on their results. We prepared a small test set of 256x256 pixel images consisting of healthy leaves and of images with the Esca, Chlorosis and Black Rot diseases, and started to work.

We applied the transfer learning approach: we trained a deep classifier on the PlantVillage images and then evaluated it on our test subset of images. To find the most appropriate pretrained network for transfer learning, we compared four models whose weights had been obtained for ILSVRC 2015 (the ImageNet Large Scale Visual Recognition Challenge): VGG19 [5], InceptionV3 [6], ResNet50 [7] and Xception [8]. Each classifier is built in the same way: all layers of the pretrained network are kept except the final classification layer, and a global average pooling operation is added on top of the base network to reduce the spatial dimensions of its three-dimensional output tensor. Further, we appended a densely connected layer of 256 rectified linear units with a dropout rate of 0.5. A softmax classification layer completes the network. We froze all layers in the base networks and trained only the last three layers using stochastic gradient descent (SGD) with a learning rate of 5e-3, momentum of 0.9 and weight decay of 5e-4 for 50 epochs.

The best classification accuracy, 99.4% on a test subset of the PlantVillage dataset, was obtained with the ResNet50 architecture. We then applied this model to the test subset of images collected from the Internet. The results were very poor: only 48% accuracy. Supposing that a network pretrained on the ImageNet dataset does not extract meaningful features from leaf images, we decided to unfreeze more layers. We unfroze all layers of the base network except the first 140 and trained the remaining 39 layers with the Adam optimizer [9], using a learning rate of 5e-5 and weight decay of 1e-6, for 30 epochs. Next, because the classification network overfits when trained for more than 30 epochs, we applied strong data augmentation, adding random transformations such as shifts, rotations and zooming.
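A minimal sketch of this two-stage transfer learning procedure is given below, assuming the Keras API of TensorFlow (the platform serves a TensorFlow model, but the exact training code and data pipeline are our assumptions; `num_classes` and the commented-out `fit` calls are placeholders).

```python
import tensorflow as tf
from tensorflow.keras import layers, models, optimizers
from tensorflow.keras.applications import ResNet50

num_classes = 38  # placeholder: number of PlantVillage classes in the split

# Base network pretrained on ImageNet, without its final classification layer.
base = ResNet50(weights="imagenet", include_top=False, input_shape=(256, 256, 3))
base.trainable = False  # stage 1: the whole base network is frozen

x = layers.GlobalAveragePooling2D()(base.output)  # collapse spatial dimensions
x = layers.Dense(256, activation="relu")(x)       # 256 rectified units
x = layers.Dropout(0.5)(x)
out = layers.Dense(num_classes, activation="softmax")(x)
model = models.Model(base.input, out)

# Stage 1: train only the appended head with SGD (lr 5e-3, momentum 0.9).
# The weight decay of 5e-4 would be realized e.g. via L2 kernel regularizers.
model.compile(optimizer=optimizers.SGD(learning_rate=5e-3, momentum=0.9),
              loss="categorical_crossentropy", metrics=["accuracy"])
# model.fit(train_data, epochs=50, validation_data=val_data)

# Stage 2: unfreeze everything except the first 140 base layers and fine-tune
# with Adam at a much lower learning rate.
base.trainable = True
for layer in base.layers[:140]:
    layer.trainable = False
model.compile(optimizer=optimizers.Adam(learning_rate=5e-5),
              loss="categorical_crossentropy", metrics=["accuracy"])
# model.fit(train_data, epochs=30, validation_data=val_data)
```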
Also, we supposed that only the central part of the leaf is required to recognize a disease, and we tried to expand our dataset by using only parts of the initial images. Some other experiments were done with background modification and other optimizations, but they improved the accuracy only a little. We believe that the problem lies in the nature of the images used. The PlantVillage photos were collected and processed under special controlled conditions, so they are rather synthetic and differ from real-life images, see Figure 1.

Figure 1. Top three photos are from the PlantVillage database. Bottom photos are real-life photos of sick leaves

This proves that if we want a good result, we need a real-life database. We collected our own database of grape leaf images from open sources, then reduced their size and extracted only the meaningful parts. As a result, we had a set of 256x256 pixel images consisting of 130 healthy leaves and 30-70 images each for the Esca, Chlorosis and Black Rot diseases. The number of images is very small, so we had to try some new approaches.

3. Siamese networks

How can we get good features from a very small amount of data? We addressed this problem with the so-called one-shot approach, for which siamese neural networks offer a solution [10-13]. A siamese network consists of twin networks joined by a similarity layer with an energy function on top. The weights of the twins are tied, so the result is invariant; moreover, because each network computes the same function, very similar images cannot end up in very different locations of the feature space. The similarity layer determines some distance metric between the so-called embeddings, i.e. the high-level feature representations of the input pair of images. Training on pairs means that there are quadratically many possible samples to train the model on, which makes it hard to overfit. The number of possible pairs can be computed with the combinatorial formula for k-combinations; thereby, for the smallest class with 31 images (Black Rot) we have 1860 pairs.

Our classification model is a siamese network with convolutional twins that share tied weights. Each twin processes one image of the input pair to extract a vector of high-level features. The pair of embeddings is then passed through a lambda layer that computes the simple element-wise L1 distance (Figure 2). We connect a single sigmoid neuron to this distance layer, which makes it possible to train the model with the binary cross-entropy loss. We use exclusively rectified linear (ReLU) units in the twins, except for the last densely connected layer with sigmoid activations. The initial number of convolutional filters is 32, doubling in each layer. After the last pooling layer, we added a flatten operation to squeeze the convolutional features into a vector, followed by a densely connected layer with 1024 sigmoid neurons. We also experimented with adding L2 regularization to each of the convolutional layers but did not observe a visible effect. In addition, we tried varying the size of the embedding layer from 256 to 4096. The best architecture we created is presented in Figure 2; a code sketch of it is given below.
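Assuming the same Keras API as above, the model just described can be sketched as follows. The tied twins, the L1 lambda layer, the single sigmoid output neuron, the 32-filters-doubling scheme and the 1024-unit sigmoid embedding follow the text; the number of convolutional blocks and the kernel sizes are our assumptions (Figure 2 shows the exact configuration).

```python
import tensorflow as tf
from tensorflow.keras import layers, models

INPUT_SHAPE = (256, 256, 3)

def build_twin():
    # Convolutional twin: filters start at 32 and double in each block.
    inp = layers.Input(INPUT_SHAPE)
    x = inp
    for filters in (32, 64, 128, 256):  # block count is an assumption
        x = layers.Conv2D(filters, 3, activation="relu")(x)
        x = layers.BatchNormalization()(x)
        x = layers.MaxPooling2D()(x)
    x = layers.Flatten()(x)
    x = layers.Dense(1024, activation="sigmoid")(x)  # the embedding layer
    return models.Model(inp, x, name="twin")

twin = build_twin()  # a single instance, so both inputs share tied weights
left = layers.Input(INPUT_SHAPE)
right = layers.Input(INPUT_SHAPE)

# Lambda layer: element-wise L1 distance between the two embeddings.
l1 = layers.Lambda(lambda t: tf.abs(t[0] - t[1]))([twin(left), twin(right)])
# A single sigmoid neuron on top of the distance layer ("same" vs "different"),
# trained with binary cross-entropy and Adam at lr 1e-4, as in the text.
same = layers.Dense(1, activation="sigmoid")(l1)

siamese = models.Model([left, right], same)
siamese.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
                loss="binary_crossentropy")
```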
Figure 2. Our best siamese convolutional architecture. «Conv» means the convolutional operation, «BN» is a batch normalization, «32 @ 123x123» means 32 feature maps of size 123x123

We used the K-nearest neighbors (KNN) algorithm to solve the classification task in the test phase. We set the K parameter to 1 nearest neighbor, which makes it equivalent to a one-shot learning task, except for one detail: we utilize all the training data as the support set instead of picking randomly from the dataset. As for the distance metric, since we used the absolute distance in the lambda layer of the siamese network, we preferred the Manhattan distance.

We apply data augmentation by adding rotations in a range of 75 degrees and random shifts in all dimensions, including shifts within a channel, in a range of 0.2. In addition, we use slight zooming and vertical and horizontal flips. We trained the siamese network with the proposed augmentation for 100 batches of size 32 per training epoch, for 35 epochs. As the optimization method we used Adam with a learning rate of 0.0001, and as the loss function the binary cross-entropy. After training the model, we kept only the encoder network, represented by one «shoulder» of the one-shot model, the so-called twin, and further used this part of the network as a feature extractor.

After that, we took the training subset with five classes of images: Healthy, Esca, Black Rot, Chlorosis and Mildew. We split the data into train and test sets with a ratio of about 75:25 and passed both sets of images through our feature extractor to obtain embeddings. The embeddings of the training subset were then used as training data for the KNN, and the remaining test set was used for verification. The classification accuracy reached 94%. Besides, we tried to mix the real-life data with images from the PlantVillage database within the train and test subsets. Siamese networks are able to generalize input data to the latent high-dimensional embedding space even for unseen classes, and the obtained accuracy of 92% (our best result) supports this consideration.

We used the prepared embeddings to train the t-SNE method [14], a common technique for visualizing high-dimensional data; a sketch of this test-phase pipeline is given below. We extracted two components to plot them in 2D space (Figure 3). One can see five separate clusters, one per class; we have labeled each cluster with the name of the corresponding class. A few points wrongly fell into a different cluster. We investigated such cases and slightly modified the images: for example, one healthy leaf photographed against a white wall ended up in the Mildew cluster, so we removed the background influence from the image, and it moved to the healthy cluster. After that, we trained the model again and got 99% accuracy.

Figure 3. t-SNE visualization of the high-level features extracted by the siamese twin
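The test-phase pipeline can be sketched as follows, reusing the `twin` encoder from the previous sketch; the array names (`x_train`, `y_train`, `x_test`, `y_test`, with integer class labels) are placeholders for our dataset.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.neighbors import KNeighborsClassifier
from sklearn.manifold import TSNE

# Use the trained twin as a feature extractor to obtain embeddings.
train_emb = twin.predict(x_train)
test_emb = twin.predict(x_test)

# 1-nearest-neighbor classification with the Manhattan (L1) distance,
# using all training embeddings as the support set.
knn = KNeighborsClassifier(n_neighbors=1, metric="manhattan")
knn.fit(train_emb, y_train)
print("test accuracy:", knn.score(test_emb, y_test))

# Project the embeddings to two t-SNE components for visualization.
proj = TSNE(n_components=2).fit_transform(np.vstack([train_emb, test_emb]))
labels = np.concatenate([y_train, y_test])
plt.scatter(proj[:, 0], proj[:, 1], c=labels, cmap="tab10", s=8)
plt.show()
```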
4. Architecture and principles of the platform

After finishing the creation of our base network, we moved on to the other parts of the platform. We agreed on the architecture presented in Figure 4.

Figure 4. Architecture of the PDDP

The PDDP consists of a set of interconnected services and tools developed, deployed and hosted with the help of the JINR cloud infrastructure. Our web portal (pdd.jinr.ru) was developed with Node.js (Sails.js). It provides not only a web interface but also an API for third-party services. The TensorFlow model is realized as a service in Docker; it can run on a virtual server or on a GPU cluster. Right now we store images directly on the local drive, but if their number increases dramatically, we will use cloud storage such as disk.jinr.ru. We will use Apache Cordova to create the mobile app for the Android, iOS and Windows platforms.

We also determined the basic roles and use cases for the PDDP. Users can send photos and a text description of sick plants through the web interface or the mobile application and get the cause of the illness; browse disease descriptions and galleries of ill plants; and verify that the disease was recognized correctly and that the treatment helps. Experts can browse users' requests and verify the recognition; request the addition of their own images, or of images from a user complaint, to the database; request a change in a disease description; and request retraining of the model with new images. Researchers can work with the image database through the web interface or the API; download all or only a part of the database; and obtain an API key to submit recognition tasks to the platform (a hypothetical example follows below). Supervisors can add new images to the database, initiate retraining of the model, and get various statistical metrics about the portal users.

Currently we have a website with general information (pdd.jinr.ru), an open image database, and the model running in a Docker container on a virtual server. One can submit disease detection jobs at the website and get the results, see Figure 5.

Figure 5. Results of the grape disease detection at pdd.jinr.ru
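The paper does not document the recognition API itself, so the following is only a hypothetical illustration of how a third-party request might look; the endpoint path, authorization scheme, field names and response format are all invented for this sketch.

```python
import requests

API_URL = "https://pdd.jinr.ru/api/v1/detect"  # hypothetical endpoint path
API_KEY = "YOUR-API-KEY"                       # key obtained from the platform

# Submit one leaf photo with a short text description (hypothetical fields).
with open("sick_leaf.jpg", "rb") as f:
    resp = requests.post(
        API_URL,
        headers={"Authorization": f"Bearer {API_KEY}"},  # assumed auth scheme
        files={"image": f},
        data={"description": "grape leaf with dark spots"},
    )
resp.raise_for_status()
print(resp.json())  # e.g. ranked candidate diseases with suggested treatments
```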
5. Conclusion and plans

Having a lot of images is not enough to recognize diseases correctly: the quality of the image database is extremely important for the detection results. Siamese neural networks are a very promising research direction for plant disease detection projects. We have reached 99% accuracy in the detection of some grape diseases. We will keep developing our web portal, and we are going to present a draft mobile application in the second half of 2019.

Acknowledgment

The reported study was funded by RFBR according to the research project № 18-07-00829.

References

[1] Savary S., Ficke A., Aubertot J.-N., Hollier C. Crop losses due to diseases and their implications for global food production losses and food security // Food Security. – 2012. – Vol. 4.
[2] Plantix project home page [Electronic resource]: https://plantix.net (accessed 1.10.2018).
[3] Mohanty S.P., Hughes D.P., Salathé M. Using deep learning for image-based plant disease detection // Frontiers in Plant Science. – 2016. – Vol. 7. – Article 1419.
[4] PlantVillage project home page [Electronic resource]: https://plantvillage.psu.edu/ (accessed 1.10.2018).
[5] Simonyan K., Zisserman A. Very deep convolutional networks for large-scale image recognition // arXiv preprint arXiv:1409.1556. – 2014.
[6] Szegedy C. et al. Rethinking the inception architecture for computer vision // Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. – 2016. – p. 2818-2826.
[7] He K. et al. Deep residual learning for image recognition // Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. – 2016. – p. 770-778.
[8] Chollet F. Xception: Deep learning with depthwise separable convolutions // arXiv preprint. – 2016.
[9] Kingma D.P., Ba J. Adam: A method for stochastic optimization // arXiv preprint arXiv:1412.6980. – 2014.
[10] Koch G., Zemel R., Salakhutdinov R. Siamese neural networks for one-shot image recognition // ICML Deep Learning Workshop. – 2015. – Vol. 2.
[11] Taigman Y. et al. DeepFace: Closing the gap to human-level performance in face verification // Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. – 2014. – p. 1701-1708.
[12] Hadsell R., Chopra S., LeCun Y. Dimensionality reduction by learning an invariant mapping // Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. – 2006. – Vol. 2. – p. 1735-1742.
[13] Appalaraju S., Chaoji V. Image similarity using deep CNN and curriculum learning // arXiv preprint arXiv:1709.08761. – 2017.
[14] Maaten L., Hinton G. Visualizing data using t-SNE // Journal of Machine Learning Research. – 2008. – Vol. 9. – p. 2579-2605.