<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Optimizing Dehusked Arecanut Quality Segregation: CNN-Based Approach with Contrast Enhancement and Data Augmentation</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Sameer Patil</string-name>
          <email>sameer@dmscollege.ac.in</email>
          <xref ref-type="aff" rid="aff3">3</xref>
          <xref ref-type="aff" rid="aff5">5</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Aparajita Naik</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
          <xref ref-type="aff" rid="aff5">5</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Marlon Sequeira</string-name>
          <email>marlon@unigoa.ac.in</email>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
          <xref ref-type="aff" rid="aff5">5</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sulaxana Vernekar</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
          <xref ref-type="aff" rid="aff5">5</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jivan</string-name>
          <xref ref-type="aff" rid="aff3">3</xref>
          <xref ref-type="aff" rid="aff5">5</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Electrical Engineering, Cambridge University</institution>
          ,
          <addr-line>Cambridge</addr-line>
          ,
          <country country="UK">UK</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Electronics Programme, School of Physical and Applied Science, Goa University</institution>
          ,
          <addr-line>Goa</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>GVM'S GGPR College of Commerce and Economics</institution>
          ,
          <addr-line>Goa</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Karnataka)</institution>
          ,
          <addr-line>Tamil Nadu, Kerala, Assam</addr-line>
        </aff>
        <aff id="aff4">
          <label>4</label>
          <institution>Research Supervisor, Electronics Programme, School of Physical and Applied Science, Goa University</institution>
          ,
          <addr-line>Goa</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
        <aff id="aff5">
          <label>5</label>
          <institution>SCCTT-2024: International Symposium on Smart Cities</institution>
          ,
          <addr-line>Challenges, Technologies and Trends, 29th Nov 2024, Delhi</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>In the production process of Areca nut, the segregation stage is of prime importance. At present, most commercial retailers rely on skilled workers for quality segregation, so finalizing the product costing takes considerable time. Inputs received from marketing executives indicate that, for any automatic segregation method to be meaningful, the quality segregation should not have errors of more than 5% standard deviation. In this study, we propose a methodology based on 10-fold cross-validation training of a Convolutional Neural Network (CNN) using contrast enhancement and no data augmentation of the images. We also compare the quality segregation results obtained with several processing methods: with and without data augmentation, with and without cropping, and with and without contrast enhancement. The database developed here uses Areca nut cultivated in the Western Ghats region of the Indian Peninsula, with a particular focus on the Konkan belt. We achieved the lowest standard deviation, 4.1%, for cropped images with contrast enhancement.</p>
      </abstract>
      <kwd-group>
        <kwd>Areca nut</kwd>
        <kwd>Segregation</kwd>
        <kwd>Convolutional Neural Networks (CNN)</kwd>
        <kwd>Contrast Limited Adaptive Histogram Equalization (CLAHE)</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Areca palm (Areca catechu L.) is grown for its kernel, popularly known in India as Areca nut (or
Betel nut or Supari). It is grown commercially along the western coast of India (Maharashtra, Goa,
Karnataka), Tamil Nadu, Kerala, Assam [...] fabrics, textile dyes and building materials. Hence, due to
its high economic significance, Areca nut has become an important cash crop.</p>
      <p>According to the latest figures, India is the largest producer worldwide, contributing approximately
904 thousand metric tons in 2020 [6]. The top ten Areca nut producing countries are shown in Figure
1.1.</p>
      <p>Fig 1.1: Areca nut production in Asia Pacific in 2020 by country (in 1000 metric tons) [6]</p>
      <p>The Areca nut kernel is hard on the outside, with the inner endosperm marbled in dark brown and
white [7]. The crucial steps in the Areca nut production process are listed below.</p>
      <p>1. Harvesting
2. Drying
3. De-husking
4. Nut segregation based on quality</p>
      <p>Of the steps listed above, nut segregation is the most labor-intensive and time-consuming. In Goa
(India), Goa Bagayatdar, a cooperative organization, is a leading Areca nut collector. At its collection
centres, nuts are classified on the basis of texture, colour and quality into seven categories (Supari,
Safed, Laal, Vench, Kharad, Tukda and Baad) [8]. Because skilled laborers for this work are in short
supply, it is essential to develop a unit that segregates nuts by quality. Such a unit would not only
address the scarcity of laborers but also save farmers’ time.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Literature Review</title>
      <p>Considerable work has been done in machine learning and image processing to identify, categorize,
and grade agricultural products. S. Siddesha et al., in their study of the texture-based classification of
Areca nut, extracted texture features using Wavelet, Gabor, Gray Level Difference Matrix, Local Binary
Pattern (LBP), and Gray Level Co-Occurrence Matrix (GLCM) features, and classified the nuts with a
Nearest Neighbor classifier. They achieved a classification rate of 91.43% with Gabor wavelet features
[9]. Mallaiah Suresha et al. proposed a classification of diseased and undiseased Areca nuts using LBP,
Haar wavelet, GLCM, and Gabor texture features, achieving a 92.00% success rate [10]. T. Liu et al.
attempted automatic classification by extracting the color, shape, and texture features of de-husked
Areca nut [11]. K.-Y. Huang used image processing techniques and neural networks for quality
detection and classification of Areca nuts. Six geometric features, three color features, and defects
were used for the classification process, which attained an accuracy of 90.9% [12].</p>
      <p>Deep Learning (DL) approaches are increasingly important in machine learning because of their
high degree of abstraction and their capacity to identify image patterns automatically [13]. Among the
many architectures in use, the Convolutional Neural Network (CNN) is the deep learning architecture
most frequently applied to image processing [14,15,16]. A CNN is a kind of Artificial Neural Network
(ANN) that uses the convolution operation in at least one of its layers [14].</p>
      <p>To the best of our knowledge, very little research has been done on the classification of dehusked
Areca nuts using CNNs.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Data Acquisition Setup</title>
      <p>This paper deals with the quality classification of Areca nut from the Konkan belt of India,
particularly from the state of Goa. Since there is no publicly available database of Areca nut
images, a dedicated setup was built to create an initial database. The setup consists of a top-mounted
camera with a sample table below it at a distance of approximately 14 cm. Twenty white LEDs arranged
radially around the camera illuminate the sample evenly. A hollow cylinder lined with black paper on
its inner surface is placed around the sample and camera to block stray light from entering the
acquisition setup. The black paper prevents light from reflecting off the inner walls, which would
otherwise create glare on the camera lens. The setup is powered from a 220 V, 50 Hz AC supply, which
is converted to a constant-current DC source; a high-voltage capacitor of 220 µF / 450 V connected in
parallel reduces flicker in the illumination. We used a lightweight 5 MP Pi camera module, which
communicates with the Raspberry Pi 3 B+ board using the MIPI Camera Serial Interface protocol. For
the same reasons, the Areca nut to be imaged is placed on a black cloth laid at the base of the hollow
cylinder. Figure 2.1 shows the data acquisition setup designed for capturing images of Areca nuts.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Methodology</title>
      <sec id="sec-4-1">
        <title>4.1. Convolutional Neural Network</title>
        <p>Convolutional Neural Networks (CNNs) have been studied extensively in recent literature [17,18].
A CNN is a class of deep learning algorithm that classifies data efficiently by recognizing patterns in
an image. It is a feed-forward network built from basic blocks such as convolutional, pooling, and
activation layers, which can be stacked in varying arrangements. Together, these layers form the
feature extraction segment of a CNN [19]. In the classification segment, the extracted features are fed
into the fully connected layer and the classification layer [20]. The layers used in our custom CNN
model are detailed in Figure 3.1.</p>
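        <p>The feature extraction segment described above can be illustrated with a minimal sketch in plain Python. It is not the custom model of Figure 3.1: the helper names (conv2d_valid, relu, max_pool, extract_features), the 2x2 kernel and the toy image are assumptions chosen only to show how a convolutional layer, an activation layer and a pooling layer are chained. As in most deep learning libraries, the convolution is computed without flipping the kernel.</p>
        <preformat>
```python
from typing import List

Matrix = List[List[float]]


def conv2d_valid(image: Matrix, kernel: Matrix) -> Matrix:
    """Slide the kernel over the image with stride 1 and no padding."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(kh)
                for j in range(kw)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(fmap: Matrix) -> Matrix:
    """Activation layer: negative responses are set to zero."""
    return [[max(0.0, v) for v in row] for row in fmap]


def max_pool(fmap: Matrix, size: int = 2) -> Matrix:
    """Non-overlapping max pooling; incomplete border windows are dropped."""
    rows = len(fmap) // size
    cols = len(fmap[0]) // size
    return [
        [
            max(
                fmap[r * size + i][c * size + j]
                for i in range(size)
                for j in range(size)
            )
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def extract_features(image: Matrix, kernel: Matrix) -> Matrix:
    """Convolution, then activation, then pooling (hypothetical single stage)."""
    return max_pool(relu(conv2d_valid(image, kernel)))


# Toy 5x5 image with a vertical edge, and a kernel that responds to such edges.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
kernel = [[-1, 1], [-1, 1]]
features = extract_features(image, kernel)
```
        </preformat>
        <p>In this toy case the pooled feature map is non-zero only in the column that contains the edge. A real CNN learns its kernel weights during training and stacks several such stages before the fully connected layer.</p>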
        <p>Note that all of the images used in this study show the Areca nut from the top. This is because
of the shape of the Areca nut, which normally comes to rest with its flat surface at the bottom. We also
wanted to study the accuracy of segregation based on the top view alone, in order to design an
algorithm that takes less time to classify and thus increases the speed of segregation. In this
experiment, we performed the image processing operations listed below to understand which of them
yield the best outcomes with the CNN.
1. The Areca nut image has only been segmented and not cropped to a Region of Interest (ROI)
closest to its edges. This database is labelled as NoCrop_NoContrast.
2. The Areca nut image has been cropped to the ROI closest to its edges. This database is labelled
as Crop_NoContrast.
3. The Areca nut image has only been segmented and not cropped to the ROI closest to its edges, and
has been contrast-enhanced using Contrast-Limited Adaptive Histogram Equalization
(CLAHE). This database is labelled as NoCrop_Contrast.
4. The Areca nut image has been cropped to the ROI closest to its edges and has been
contrast-enhanced using CLAHE. This database is labelled as Crop_Contrast.</p>
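        <p>The difference between the segmented-only and the cropped databases can be sketched as a bounding-box crop. The sketch below assumes that segmentation has already set every background pixel to a single value (0 here) and that the ROI is the tightest rectangle around the remaining pixels; the function name crop_to_roi and the background parameter are hypothetical and are not taken from the authors' pipeline.</p>
        <preformat>
```python
from typing import List

Image = List[List[int]]


def crop_to_roi(image: Image, background: int = 0) -> Image:
    """Return the tightest rectangle that contains all non-background pixels."""
    # Rows and columns that contain at least one foreground pixel.
    rows = [r for r, row in enumerate(image) if any(p != background for p in row)]
    cols = [
        c
        for c in range(len(image[0]))
        if any(row[c] != background for row in image)
    ]
    if not rows:
        return []  # the image holds background only, so there is nothing to crop to
    top, bottom = rows[0], rows[-1]
    left, right = cols[0], cols[-1]
    return [row[left:right + 1] for row in image[top:bottom + 1]]


# A segmented 4x4 image: the nut occupies the central 2x2 block.
segmented = [
    [0, 0, 0, 0],
    [0, 5, 7, 0],
    [0, 0, 9, 0],
    [0, 0, 0, 0],
]
cropped = crop_to_roi(segmented)
```
        </preformat>
        <p>Cropping removes the uniform background border, so more of the network input is occupied by the nut itself.</p>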
        <p>Thus, in this experiment, we work with four distinct databases. Sample images from each
database are shown in Figure 4.1 below.</p>
        <p>Figure 4.1: (a) Uncropped and segmented Areca nut image. (b) Uncropped and segmented Areca nut
image with contrast enhancement using CLAHE. (c) Cropped Areca nut image.</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Contrast-limited Adaptive Histogram Equalization (CLAHE)</title>
        <p>CLAHE is an algorithm used to enhance the contrast of images. It performs histogram
equalization on non-overlapping sections of a given image, called tiles. In each tile, the histogram is
clipped at a predefined limit before equalization, which restrains the amplification of noise. The
neighbouring tiles are then blended using bilinear interpolation to avoid introducing false borders
[21].</p>
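        <p>As an illustration only, the sketch below shows the contrast-limited equalization of a single tile in plain Python. It is a simplification under stated assumptions: the splitting of the image into tiles and the bilinear blending between tiles are omitted, the clipped excess is redistributed in a single uniform pass, and the names clipped_equalize_tile, clip_limit and levels are hypothetical. It is not the implementation used in this study.</p>
        <preformat>
```python
from typing import List

Tile = List[List[int]]


def clipped_equalize_tile(tile: Tile, clip_limit: int, levels: int = 256) -> Tile:
    """Equalize one tile after clipping its histogram at clip_limit."""
    # 1. Histogram of the grey levels in the tile.
    hist = [0] * levels
    for row in tile:
        for pixel in row:
            hist[pixel] += 1

    # 2. Clip every bin and collect the excess counts.
    excess = sum(max(0, count - clip_limit) for count in hist)
    hist = [min(count, clip_limit) for count in hist]

    # 3. Spread the excess uniformly over all bins (single pass, assumption).
    share, remainder = divmod(excess, levels)
    hist = [count + share for count in hist]
    for level in range(remainder):
        hist[level] += 1

    # 4. Map each grey level through the cumulative distribution.
    total = sum(hist)
    mapping = []
    running = 0
    for count in hist:
        running += count
        mapping.append((levels - 1) * running // total)

    return [[mapping[pixel] for pixel in row] for row in tile]


# Toy tile with 4 grey levels, dominated by the darkest level.
tile = [[0, 0, 0, 0], [0, 0, 1, 3]]
limited = clipped_equalize_tile(tile, clip_limit=3, levels=4)
unlimited = clipped_equalize_tile(tile, clip_limit=8, levels=4)
```
        </preformat>
        <p>With the clip limit set as high as the pixel count, the step reduces to ordinary histogram equalization; a lower limit flattens the dominant bin before the mapping is built, so the dominant grey level is stretched less.</p>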
        <p>We also examined the effect of data augmentation on the final classification accuracy and its
standard deviation. For each database, we therefore determined the classification accuracy and
standard deviation of the CNN both with and without data augmentation.</p>
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Data Augmentation</title>
        <p>Data augmentation is a technique commonly applied when the number of training samples
available for a CNN is limited. It produces more training examples for the network by leveraging the
existing images, applying image processing operations such as scaling, rotation about an axis,
translation, and reflection about an axis. The result is a significantly larger training set derived
from the existing data [22].</p>
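        <p>A minimal sketch of three of these operations on an image stored as a nested list is given below. The function names, the one-pixel shift and the zero fill value are assumptions for illustration; scaling is left out, and the parameters actually used in this study are not specified in the text.</p>
        <preformat>
```python
from typing import List

Image = List[List[int]]


def flip_horizontal(image: Image) -> Image:
    """Reflection about the vertical axis."""
    return [row[::-1] for row in image]


def rotate_90_clockwise(image: Image) -> Image:
    """Rotation by 90 degrees: the bottom row becomes the first column."""
    return [list(column) for column in zip(*image[::-1])]


def translate_right(image: Image, shift: int, fill: int = 0) -> Image:
    """Shift every row to the right and pad the vacated pixels with fill."""
    return [[fill] * shift + row[: len(row) - shift] for row in image]


def augment(image: Image) -> List[Image]:
    """Return the original image together with three transformed copies."""
    return [
        image,
        flip_horizontal(image),
        rotate_90_clockwise(image),
        translate_right(image, shift=1),
    ]


sample = [[1, 2], [3, 4]]
augmented = augment(sample)  # four training images from one original
```
        </preformat>
        <p>Each transformed copy keeps the class label of the original image, which is how a small database is expanded into a larger training set.</p>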
        <p>To evaluate our CNN, we use 10-fold cross-validation. The database is split into 10 distinct
folds, of which 9 are used for training and the remaining fold is used for testing. In the next
iteration, the fold that was used for testing returns to the training set and a different fold is held
out for testing. The procedure is thus repeated 10 times, with each fold serving as the test set exactly
once [23].</p>
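        <p>The splitting procedure can be sketched as follows. The function name k_fold_indices, the fixed random seed and the sample count of 420 are assumptions made for the example; the text does not state the size of the database or how samples were assigned to folds.</p>
        <preformat>
```python
import random
from typing import List, Tuple


def k_fold_indices(
    n_samples: int, k: int = 10, seed: int = 0
) -> List[Tuple[List[int], List[int]]]:
    """Return k (train, test) index pairs; every sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffle once, reproducibly
    folds = [indices[i::k] for i in range(k)]  # k disjoint folds
    splits = []
    for held_out in range(k):
        test = folds[held_out]
        train = [
            index
            for fold_number, fold in enumerate(folds)
            if fold_number != held_out
            for index in fold
        ]
        splits.append((train, test))
    return splits


# Hypothetical database of 420 images: 378 for training, 42 for testing per fold.
splits = k_fold_indices(420, k=10)
for train, test in splits:
    pass  # train the CNN on `train`, record its accuracy on `test`
```
        </preformat>
        <p>The ten recorded accuracies are then summarized by their mean and standard deviation, which are the quantities compared across the four databases.</p>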
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Results and Analysis</title>
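      <p>This section presents the classification accuracy for all four databases. For each database, we
train a CNN with 10-fold cross-validation, with and without data augmentation.</p>
      <table-wrap>
        <caption>
          <p>Table 5.3: 10-fold cross-validation accuracy for uncropped images with contrast enhancement.</p>
        </caption>
        <table>
          <thead>
            <tr><th>Augmentation</th><th>Model</th><th>Fold 1</th><th>Fold 2</th><th>Fold 3</th><th>Fold 4</th><th>Fold 5</th><th>Fold 6</th><th>Fold 7</th><th>Fold 8</th><th>Fold 9</th><th>Fold 10</th><th>Mean</th><th>Std. dev.</th></tr>
          </thead>
          <tbody>
            <tr><td>No augmentation</td><td>CNN</td><td>92.86%</td><td>83.33%</td><td>78.57%</td><td>78.57%</td><td>83.33%</td><td>80.95%</td><td>90.48%</td><td>78.57%</td><td>76.19%</td><td>80.95%</td><td>82.38%</td><td>5.41%</td></tr>
            <tr><td>Augmentation</td><td>CNN</td><td>88.10%</td><td>61.90%</td><td>57.14%</td><td>73.81%</td><td>66.67%</td><td>61.90%</td><td>42.86%</td><td>73.81%</td><td>71.43%</td><td>52.38%</td><td>65.00%</td><td></td></tr>
          </tbody>
        </table>
      </table-wrap>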
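      <p>The summary statistics used throughout this section can be reproduced from the per-fold accuracies. The sketch below uses the ten no-augmentation accuracies reported above for uncropped, contrast-enhanced images and assumes that the reported standard deviation is the sample standard deviation (divisor n - 1), which agrees with the reported value of 5.41%. The function name summarize and the max_std parameter, set to the 5% criterion stated in the abstract, are hypothetical.</p>
      <preformat>
```python
from statistics import mean, stdev
from typing import List, Tuple

# Per-fold accuracies (%) for uncropped, contrast-enhanced images, no augmentation.
no_augmentation = [92.86, 83.33, 78.57, 78.57, 83.33, 80.95, 90.48, 78.57, 76.19, 80.95]


def summarize(accuracies: List[float], max_std: float = 5.0) -> Tuple[float, float, bool]:
    """Return mean, sample standard deviation, and whether the 5% criterion is met."""
    average = mean(accuracies)
    spread = stdev(accuracies)  # sample standard deviation, divisor n - 1 (assumption)
    return average, spread, max_std >= spread


average, spread, acceptable = summarize(no_augmentation)
print(f"mean = {average:.2f}%, std = {spread:.2f}%, within 5%: {acceptable}")
```
      </preformat>
      <p>For this row the criterion is not met, since the standard deviation is slightly above 5%.</p>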
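      <p>Table 5.3 shows the results of uncropped images with contrast enhancement. The results indicate
that these methods do not improve significantly over the earlier two methods, whose results are
listed in Table 5.1 and Table 5.2. The augmentation process gives the worst result, with a standard
deviation of more than 10%.</p>
      <table-wrap>
        <caption>
          <p>Table 5.4: 10-fold cross-validation accuracy for cropped images with contrast enhancement.</p>
        </caption>
        <table>
          <thead>
            <tr><th>Augmentation</th><th>Model</th><th>Fold 1</th><th>Fold 2</th><th>Fold 3</th><th>Fold 4</th><th>Fold 5</th><th>Fold 6</th><th>Fold 7</th><th>Fold 8</th><th>Fold 9</th><th>Fold 10</th><th>Mean</th><th>Std. dev.</th></tr>
          </thead>
          <tbody>
            <tr><td>No augmentation</td><td>CNN</td><td>76.19%</td><td>80.95%</td><td>73.81%</td><td>83.33%</td><td>83.33%</td><td>73.81%</td><td>80.95%</td><td>83.33%</td><td>73.81%</td><td>80.95%</td><td>79.05%</td><td></td></tr>
            <tr><td>Augmentation</td><td>CNN</td><td>71.43%</td><td>73.81%</td><td>52.38%</td><td>59.52%</td><td>71.43%</td><td>59.52%</td><td>69.05%</td><td>66.67%</td><td>73.81%</td><td>64.29%</td><td></td><td></td></tr>
          </tbody>
        </table>
      </table-wrap>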
      <p>Table 5.4 gives the results for cropped images with contrast enhancement. Here, our custom CNN
model without augmentation gives a standard deviation close to 4%, which supports our claim that the
top view alone can be used for the segregation process. Note that the cropped images worked quite
well without augmentation but did not fare as well when augmentation was applied to the samples.
All of the above results are summarized in the boxplots in Figure 5.1.</p>
      <p>(a) Classification for uncropped images with no contrast enhancement. (b) Classification for
cropped images with no contrast enhancement.</p>
      <p>Figure 5.1 above shows boxplots of the image classification results.</p>
      <p>From boxplot (a), it may be seen that without augmentation the accuracy is close to 80% for
most trials, whereas with augmentation it varies widely, with the lowest value falling close to 50%,
which is not desirable.</p>
      <p>In boxplot (b) (cropped images with no contrast enhancement), the accuracy also varies widely,
from 66% to 84%, which casts doubt on the reliability of the classification. The same is true for
augmented images, where the accuracy varies from 64% to 84%.</p>
      <p>Boxplot (c) (uncropped images with contrast enhancement) behaves similarly to boxplot (b): the
accuracy varies widely, from 75% to 95% without augmentation and from 45% to 90% with
augmentation, indicating that the results are not consistent.</p>
      <p>In boxplot (d), the accuracy for non-augmented images is centered around 82%, with a small
spread from 74% to 84%, for our custom CNN model. This suggests that the method is more reliable
for the classification of Areca nuts. The same is not true with augmentation, where the accuracy
varies from 50% to 75%.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion</title>
      <p>In this article, we evaluated four classification methods based on 10-fold cross-validation
training of a custom CNN model, using contrast enhancement and data augmentation of the images.
The results indicate that, with the custom CNN model, classification of cropped, contrast-enhanced
images without augmentation yields the best outcome, with a standard deviation of less than 5%. A
standard deviation below 5% is significant for agriculturalists segregating Areca nuts, considering
that only the top view was used. Single-image segregation can greatly increase the speed of
segregation; payments to farmers can then be made on the spot, and the loss of revenue due to human
fatigue can be reduced. The experiments suggest that the algorithm can be implemented in a machine
that segregates Areca nuts automatically. This article thus provides concept validation for the
manufacture of automated Areca nut segregation units.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgement</title>
      <p>The authors acknowledge the help extended by the skilled segregators with classification and by
the officials of Goa Bagayatdar, who provided many of the Areca nut samples used in this work.</p>
    </sec>
    <sec id="sec-8">
      <title>Statements and Declarations</title>
      <sec id="sec-8-1">
        <title>Data Availability</title>
        <p>The data that support the findings of this study are available from the corresponding author,
upon reasonable request.</p>
      </sec>
      <sec id="sec-8-2">
        <title>Funding and/or Conflicts of Interests/Competing Interests</title>
        <p>Funding: The authors declare that they received no funding from any agency for the work
reported above.</p>
        <p>Conflict of Interest: The authors declare that they have no conflict of interest.</p>
        <p>[4] A. Kumar et al., Assessment of areca nut use, practice and dependency among people in Guwahati, Assam: a cross-sectional study, ecancer, vol. 15, (2021), doi: 10.3332/ecancer.2021.1198.</p>
        <p>[5] M. S. Amudhan, Begum V Hazeena, and H. K. B., A review on phytochemical and pharmacological potential of Areca catechu L. seed, IJPSR, vol. 3, no. 11, 4151–4157, (2012), https://www.researchgate.net/publication/264710991_A_review_on_phytochemical_and_pharmacological_potential_of_Areca_catechu_L_Seed</p>
        <p>[6] APAC: areca nut production by country 2022, Statista. (2024). https://www.statista.com/statistics/657902/asia-pacific-areca-nut-production-by-country/</p>
        <p>[7] V. Raghavan and H. K. Baruah, Arecanut: India’s popular masticatory: history, chemistry and utilization, Econ Bot, vol. 12, no. 4, 315–345, Oct. 1958, doi: 10.1007/BF02860022.</p>
        <p>[8] Goa Bagayatdar Bazar – One-stop-shop for all. (2024). https://goabagayatdar.com/</p>
        <p>[9] S. Siddesha, S. K. Niranjan, and V. N. Manjunath Aradhya, Texture based classification of arecanut, in 2015 International Conference on Applied and Theoretical Computing and Communication Technology (iCATccT), (2015), 688–692. doi: 10.1109/ICATCCT.2015.7456971.</p>
        <p>[10] S. Mallaiah, Ajit Danti, and N. S. K, Classification of Diseased Arecanut based on Texture Features, International Journal of Computer Applications, vol. NCRAIT 3, 1–6, (2014), [Online]. Available: /proceedings/ncrait/number3/15152-1419/</p>
        <p>[11] T. Liu, J. Xie, Y. He, M. Xu, and C. Qin, An automatic classification method for betel nut based on computer vision, International Conference on Robotics and Biomimetics (ROBIO), 1264–1267, (2009), doi: 10.1109/ROBIO.2009.5420823.</p>
        <p>[12] K.-Y. Huang, Detection and classification of areca nuts with machine vision, Computers &amp; Mathematics with Applications, vol. 64, no. 5, 739–746, (2012), doi: 10.1016/j.camwa.2011.11.041.</p>
        <p>[13] J. Naranjo-Torres, M. Mora, R. Hernández-García, R. J. Barrientos, C. Fredes, and A. Valenzuela, A Review of Convolutional Neural Network Applied to Fruit Image Processing, Applied Sciences, vol. 10, no. 10, 3443, (2020), doi: 10.3390/app10103443.</p>
        <p>[14] I. Goodfellow, Y. Bengio, and A. Courville, Deep learning. in Adaptive computation and machine learning. Cambridge, Massachusetts: The MIT Press, (2016).</p>
        <p>[15] Y. LeCun, Y. Bengio, and G. Hinton, Deep learning, Nature, vol. 521, no. 7553, 436–444, (2015), doi: 10.1038/nature14539.</p>
        <p>[16] M. D. Zeiler and R. Fergus, Visualizing and Understanding Convolutional Networks, in Computer Vision – ECCV 2014, vol. 8689, D. Fleet, T. Pajdla, B. Schiele, and T. Tuytelaars, Eds., Cham: Springer International Publishing, (2014), 818–833. doi: 10.1007/978-3-319-10590-1_53.</p>
        <p>[17] W. Jia, Y. Tian, R. Luo, Z. Zhang, J. Lian, and Y. Zheng, Detection and segmentation of overlapped fruits based on optimized mask R-CNN application in apple harvesting robot, Computers and Electronics in Agriculture, vol. 172, 105380, (2020), doi: 10.1016/j.compag.2020.105380.</p>
        <p>[18] X. Mai, H. Zhang, X. Jia, and M. Q.-H. Meng, Faster R-CNN With Classifier Fusion for Automatic Detection of Small Fruits, IEEE Trans. Automat. Sci. Eng., 1–15, (2020), doi: 10.1109/TASE.2020.2964289.</p>
        <p>[19] M. Khoshdeli, R. Cong, and B. Parvin, Detection of nuclei in H&amp;E stained sections using convolutional neural networks, in 2017 IEEE EMBS International Conference on Biomedical &amp; Health Informatics (BHI), Orlando, FL, USA: IEEE, (2017), 105–108. doi: 10.1109/BHI.2017.7897216.</p>
        <p>[20] A. Krizhevsky, I. Sutskever, and G. E. Hinton, ImageNet classification with deep convolutional neural networks, Commun. ACM, vol. 60, no. 6, 84–90, (2017), doi: 10.1145/3065386.</p>
        <p>[21] S. Aboshosha, O. Zahran, M. I. Dessouky, and F. E. Abd El-Samie, Resolution and quality enhancement of images using interpolation and contrast limited adaptive histogram equalization, Multimed Tools Appl, vol. 78, no. 13, 18751–18786, (2019), doi: 10.1007/s11042-018-7022-1.</p>
        <p>[22] Z. Zhong, L. Zheng, G. Kang, S. Li, and Y. Yang, Random Erasing Data Augmentation, AAAI, vol. 34, no. 07, 13001–13008, (2020), doi: 10.1609/aaai.v34i07.7000.</p>
        <p>[23] T.-T. Wong and P.-Y. Yeh, Reliable Accuracy Estimates from k-Fold Cross Validation, IEEE Trans. Knowl. Data Eng., vol. 32, no. 8, 1586–1594, (2020), doi: 10.1109/TKDE.2019.2912815.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>[1] <article-title>Areca Nut - an overview | ScienceDirect Topics</article-title>. (<year>2024</year>). https://www.sciencedirect.com/topics/neuroscience/areca-nut</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>[2] <article-title>Origin | Arecanut</article-title>. (<year>2024</year>). https://arecanut.org/arecanut-1/origin/</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>[3] <article-title>Areca nut</article-title>, Wikipedia. (<year>2024</year>). https://en.wikipedia.org/w/index.php?title=Areca_nut&amp;oldid=1247804910</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>