<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Design and Development of Machine Learning Model for Crop Yield Prediction</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Taman Kumar</string-name>
          <email>tamankumar0808@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Kiran Jyoti</string-name>
          <email>kiranjyoti@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sandeep K. Singla</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Guru Nanak Dev Engineering College (GNDEC)</institution>, Ludhiana, Punjab, India
        </aff>
      </contrib-group>
      <abstract>
        <p>Agriculture is one of the major sources of employment in India and a major contributor to its GDP. Machine learning is a recent technology that can support the agriculture sector. This paper focuses on using machine learning techniques to predict wheat crop yield. The regression algorithms used are simple regression, gradient booster, polynomial regression and random forest. Finally, the results of each algorithm are compared with the actual results.</p>
      </abstract>
      <kwd-group>
        <kwd>Crop yield prediction</kwd>
        <kwd>machine learning</kwd>
        <kwd>regression</kwd>
      </kwd-group>
      <conference>
        <conf-name>International Conference on Emerging Technologies: AI, IoT, and CPS for Science &amp; Technology Applications</conf-name>
      </conference>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>weather, soil, use of fertilizers, and seed variety. This indicates that crop yield prediction is not a
trivial task; instead, it consists of several complicated steps. Nowadays, crop yield prediction models
can estimate the yield, but a better performance in yield prediction is still desirable (Klompenburg et
al. 2020).</p>
    </sec>
    <sec id="sec-2">
      <title>2. Literature</title>
      <p>A. Agricultural Information Extraction</p>
      <p>1) Raorane and Kulkarni (2015) used data mining tools in a crop management system. They used regression algorithms. The disadvantage is that the model is not specified.</p>
      <p>2) Kushwaha and Bhattachrya (2015) proposed a method that helps find the crop best suited to the land. The Agro algorithm is used in that paper.</p>
      <p>3) Santra et al. (2016) used artificial neural networks, decision tree algorithms and regression analysis to provide information on crops and help increase the yield rate. The drawback is that the method is not clearly specified.</p>
      <sec id="sec-2-1">
        <title>B. Crop Yield Estimation</title>
        <p>1) Kumar et al. (2015) suggested a method that helps improve crop yield. Classification is used and the parameters are compared. The demerit is that the accuracy and performance are inadequate.</p>
        <p>2) Babu and Babu (2016) gave a method that provides solutions to some farming problems, such as water and fertilizer use. They also used the Agro algorithm, and accuracy is again a problem.</p>
        <p>3) Jain et al. (2017) found a better sequence in which crops should be sown so that the maximum yield is obtained. Besides sequencing, they also used machine learning for irrigation and crop diseases.</p>
        <p>4) Djodiltachoumy (2017) used the K-means algorithm (clustering) on previous years' data and predicted yield from that database. The demerit is that only a small amount of data was used and the approach is suitable only for association rules.</p>
        <p>5) Nigam et al. (2019) concluded that random forest regression gives the highest yield prediction accuracy. A simple recurrent neural network performs better on rainfall prediction, while LSTM is good for temperature prediction.</p>
        <p>C. Machine Learning Algorithms</p>
        <p>1) Khairunniza-Bejo et al. (2014) defined a method using an artificial neural network to help farmers solve some of their problems. The disadvantage is that the proposed method is very time consuming.</p>
        <p>2) Ramesh and Vardhan (2015) used multiple linear regression to analyze and verify the database. The demerit is that this method has low accuracy.</p>
        <p>3) Savla et al. (2015) suggested a framework using normalization, clustering and classification to understand crop yield rate zones based on attributes.</p>
        <p>4) Sindhura et al. (2016) also used multiple linear regression to predict and support decision making in many sectors.</p>
        <p>A comprehensive review of the literature revealed that crop yield estimation and agricultural information extraction from ancillary as well as historical data remain open problems. Various machine learning models and other algorithms have been used in the past for yield estimation.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Methodology</title>
      <p>The collected data are preprocessed. Some ‘NA’ values were present; each is filled with the average of the values directly above and below it in the same column.</p>
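      <p>The NA-filling rule can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the KNIME workflow used in this work: the function name fill_na_with_neighbor_mean and the sample rainfall values are hypothetical, missing entries are represented as None, and a gap at the top or bottom of a column (which has only one neighbour) is assumed to take that single neighbouring value.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from typing import List, Optional


def fill_na_with_neighbor_mean(column: List[Optional[float]]) -> List[Optional[float]]:
    """Fill each missing entry (None) with the mean of the nearest
    observed values above and below it in the same column."""
    filled = list(column)
    for i, value in enumerate(column):
        if value is not None:
            continue
        # Nearest observed value in the rows above, if any.
        above = next(
            (column[j] for j in range(i - 1, -1, -1) if column[j] is not None),
            None,
        )
        # Nearest observed value in the rows below, if any.
        below = next(
            (column[j] for j in range(i + 1, len(column)) if column[j] is not None),
            None,
        )
        neighbors = [v for v in (above, below) if v is not None]
        if neighbors:  # leave the gap if the whole column is missing
            filled[i] = sum(neighbors) / len(neighbors)
    return filled


# Hypothetical rainfall column with two 'NA' entries (assumed values).
rainfall = [520.0, None, 480.0, 610.0, None]
print(fill_na_with_neighbor_mean(rainfall))
# [520.0, 500.0, 480.0, 610.0, 610.0]
```
]]></preformat>
      <p>In this sketch, neighbours are always taken from the originally observed values, so a run of consecutive missing entries is filled from the nearest observed values on either side of the run.</p>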
      <p>Feature selection is applied to extract the important parameters for the modeling framework. The correlation between all the parameters is computed, and the parameters that do not affect the crop yield are eliminated. The image of the correlation is given below:</p>
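      <p>The correlation-based selection step can be illustrated as follows. This is only a sketch: the Pearson coefficient is assumed as the correlation measure, the cut-off of 0.3 is an arbitrary illustrative threshold, and the feature names and values are hypothetical, since the text does not state which measure, threshold or parameters were used.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import math
from typing import Dict, List


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def select_features(
    features: Dict[str, List[float]],
    target: List[float],
    threshold: float = 0.3,  # assumed cut-off, not taken from the paper
) -> List[str]:
    """Keep features whose absolute correlation with the target
    is at least the threshold; drop the rest."""
    return [
        name
        for name, values in features.items()
        if abs(pearson(values, target)) >= threshold
    ]


# Hypothetical parameters and yields (assumed values for illustration).
features = {
    "rainfall": [500.0, 520.0, 480.0, 610.0, 590.0],
    "fertilizer": [100.0, 110.0, 105.0, 130.0, 125.0],
    "station_id": [3.0, 1.0, 4.0, 1.0, 5.0],
}
yield_kg = [4100.0, 4250.0, 4000.0, 4900.0, 4700.0]
print(select_features(features, yield_kg))
# ['rainfall', 'fertilizer']
```
]]></preformat>
      <p>Here the hypothetical station_id column is only weakly correlated with the yield and is therefore eliminated, while the two strongly correlated parameters are kept.</p>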
      <sec id="sec-3-1">
        <title>B. Output</title>
        <p>1. Results of the applied machine learning algorithms are compared to evaluate the model. The results are given below in Table 2:</p>
        <p>2. The predicted and actual values for the years 2011 to 2018 are also shown below as line and bar graphs:</p>
        <p>3. The performance evaluation measures of the applied algorithms, namely Mean Absolute Error, Mean Squared Error, Root Mean Squared Error and Mean Absolute Percentage Error, are given below in Table 3:</p>
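        <p>For reference, the four error measures can be computed as in the following sketch. It uses the standard definitions; the function name regression_errors and the sample values are hypothetical, and the Mean Absolute Percentage Error is returned as a fraction rather than a percentage, which is an assumption about the reporting convention.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import math
from typing import Dict, Sequence


def regression_errors(
    actual: Sequence[float], predicted: Sequence[float]
) -> Dict[str, float]:
    """Return MAE, MSE, RMSE and MAPE for paired actual/predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)  # RMSE is the square root of MSE
    # MAPE as a fraction (0.10 means 10%); actual values must be nonzero.
    mape = sum(abs(e / a) for a, e in zip(actual, errors)) / n
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "MAPE": mape}


# Hypothetical actual and predicted yields (assumed values).
actual = [4000.0, 5000.0]
predicted = [4400.0, 4500.0]
print(regression_errors(actual, predicted))
```
]]></preformat>
        <p>Because the Root Mean Squared Error is the square root of the Mean Squared Error, the two reported columns can be checked against each other for consistency.</p>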
        <table-wrap id="tab3">
          <label>Table 3</label>
          <table>
            <thead>
              <tr>
                <th>Algorithm</th>
                <th>Mean Absolute Error</th>
                <th>Mean Squared Error</th>
                <th>Root Mean Squared Error</th>
                <th>Mean Absolute Percentage Error</th>
              </tr>
            </thead>
            <tbody>
              <tr>
                <td></td>
                <td>709.744</td>
                <td>597,836.813</td>
                <td>773.199</td>
                <td>0.144</td>
              </tr>
              <tr>
                <td></td>
                <td>570.942</td>
                <td>440,162.565</td>
                <td>663.447</td>
                <td>0.115</td>
              </tr>
              <tr>
                <td>Simple Regression</td>
                <td>583.375</td>
                <td>452,351.375</td>
                <td>672.571</td>
                <td>0.118</td>
              </tr>
              <tr>
                <td>Polynomial Regression</td>
                <td>986.935</td>
                <td>1,102,940.261</td>
                <td>1050.21</td>
                <td>0.202</td>
              </tr>
            </tbody>
          </table>
        </table-wrap>
        <p>4. Accuracy of applied models is given below in Table 4:</p>
        <p>Table 4. Accuracy of Applied Models</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Conclusion and Future work</title>
      <p>The results clearly show that gradient booster gives the most accurate results. The results are currently obtained using the KNIME software; our future work is to develop an application that farmers can operate easily.</p>
    </sec>
    <sec id="sec-5">
      <title>5. References</title>
      <p>[14] Raorane, A. A., and R. V. Kulkarni. "Application of DataMining tool to crop management system." Russian Journal of Agricultural and Socio-Economic Sciences 37, no. 1 (2015).</p>
      <p>[15] Rajak, Rohit Kumar, Ankit Pawar, Mitalee Pendke, Pooja Shinde, Suresh Rathod, and Avinash Devare. "Crop recommendation system to maximize crop yield using machine learning technique." International Research Journal of Engineering and Technology 4, no. 12 (2017): 950-953.</p>
      <p>[16] Savla, Anshal, Himtanaya Bhadada, Parul Dhawan, and Vatsa Joshi. "Application of machine learning techniques for yield prediction on delineated zones in precision agriculture." IJNCAA (2015): 48</p>
      <p>[17] Son, Nguyen-Thanh, Chi-Farn Chen, Cheng-Ru Chen, Horng-Yuh Guo, Youg-Sing Cheng, Shu-Ling Chen, Huan-Sheng Lin, and Shih-Hsiang Chen. "Machine learning approaches for rice crop yield predictions using time-series satellite data in Taiwan." International Journal of Remote Sensing 41, no. 20 (2020): 7868-7888.</p>
      <p>[18] Sindhura, D., B. Navya Krishna, K. Sai Prasanna Lakshmi, B. Mallikarjun Rao, and J. Rajendra Prasad. "Effects of Climate Changes on Agriculture." International Journal of Advanced Research in Computer Science and Software Engineering (2016).</p>
      <p>[19] Van Klompenburg, Thomas, Ayalew Kassahun, and Cagatay Catal. "Crop yield prediction using machine learning: A systematic literature review." Computers and Electronics in Agriculture 177 (2020): 105709.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name><surname>Babu</surname>, <given-names>T. Giri</given-names></string-name>, and
          <string-name><prefix>Dr</prefix> <given-names>G. Anjan</given-names> <surname>Babu</surname></string-name>.
          <article-title>"Big Data Analytics to Produce Big Results in the Agricultural Sector."</article-title>
          (<year>2016</year>).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name><surname>Djodiltachoumy</surname>, <given-names>S.</given-names></string-name>
          <article-title>"A Model for Prediction of Crop Yield."</article-title>
          <source>International Journal of Computational Intelligence and Informatics</source>
          <volume>6</volume>, no. <issue>4</issue> (<year>2017</year>).
        </mixed-citation>
      </ref>
      <ref id="ref3">
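        <mixed-citation>
          [3]
          <string-name><surname>Ghadge</surname>, <given-names>Rushika</given-names></string-name>,
          <string-name><given-names>Juilee</given-names> <surname>Kulkarni</surname></string-name>,
          <string-name><given-names>Pooja</given-names> <surname>More</surname></string-name>,
          <string-name><given-names>Sachee</given-names> <surname>Nene</surname></string-name>, and
          <string-name><given-names>R. L.</given-names> <surname>Priya</surname></string-name>.
          <article-title>"Prediction of crop yield using machine learning."</article-title>
          <source>Int. Res. J. Eng. Technol. (IRJET)</source>
          <volume>5</volume> (<year>2018</year>).
        </mixed-citation>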
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name><surname>Huang</surname>, <given-names>Jui-Chan</given-names></string-name>,
          <string-name><given-names>Kuo-Min</given-names> <surname>Ko</surname></string-name>,
          <string-name><given-names>Ming-Hung</given-names> <surname>Shu</surname></string-name>, and
          <string-name><given-names>Bi-Min</given-names> <surname>Hsu</surname></string-name>.
          <article-title>"Application and comparison of several machine learning algorithms and their integration models in regression problems."</article-title>
          <source>Neural Computing and Applications</source>
          <volume>32</volume>, no. <issue>10</issue> (<year>2020</year>):
          <fpage>5461</fpage>-<lpage>5469</lpage>.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name><surname>Jain</surname>, <given-names>Nishit</given-names></string-name>,
          <string-name><given-names>Amit</given-names> <surname>Kumar</surname></string-name>,
          <string-name><given-names>Sahil</given-names> <surname>Garud</surname></string-name>,
          <string-name><given-names>Vishal</given-names> <surname>Pradhan</surname></string-name>, and
          <string-name><given-names>Prajakta</given-names> <surname>Kulkarni</surname></string-name>.
          <article-title>"Crop selection method based on various environmental factors using machine learning."</article-title>
          <source>International Research Journal of Engineering and Technology (IRJET)</source>
          <volume>4</volume>, no. <issue>2</issue> (<year>2017</year>):
          <fpage>1530</fpage>-<lpage>1533</lpage>.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name><surname>Kale</surname>, <given-names>Shivani S.</given-names></string-name>, and
          <string-name><given-names>Preeti S.</given-names> <surname>Patil</surname></string-name>.
          <article-title>"A Machine Learning Approach to Predict Crop Yield and Success Rate."</article-title>
          <source>In 2019 IEEE Pune Section International Conference (PuneCon)</source>, pp.
          <fpage>1</fpage>-<lpage>5</lpage>. IEEE, <year>2019</year>.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name><surname>Khairunniza-Bejo</surname>, <given-names>Siti</given-names></string-name>,
          <string-name><given-names>Samihah</given-names> <surname>Mustaffha</surname></string-name>, and
          <string-name>Wan Ishak Wan Ismail</string-name>.
          <article-title>"Application of artificial neural network in predicting crop yield: A review."</article-title>
          <source>Journal of Food Science and Engineering</source>
          <volume>4</volume>, no. <issue>1</issue> (<year>2014</year>):
          <fpage>1</fpage>.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name><surname>Kumar</surname>, <given-names>Rakesh</given-names></string-name>,
          <string-name><given-names>M. P.</given-names> <surname>Singh</surname></string-name>,
          <string-name><given-names>Prabhat</given-names> <surname>Kumar</surname></string-name>, and
          <string-name><given-names>J. P.</given-names> <surname>Singh</surname></string-name>.
          <article-title>"Crop Selection Method to maximize crop yield rate using machine learning technique."</article-title>
          <source>In 2015 international conference on smart technologies and management for computing, communication, controls, energy and materials (ICSTM)</source>, pp.
          <fpage>138</fpage>-<lpage>145</lpage>. IEEE, <year>2015</year>.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name><surname>Kushwaha</surname>, <given-names>Ashwani Kumar</given-names></string-name>, and
          <string-name><given-names>Sweta</given-names> <surname>Bhattachrya</surname></string-name>.
          <article-title>"Crop yield prediction using Agro Algorithm in Hadoop."</article-title>
          <source>International Journal of Computer Science and Information Technology &amp; Security (IJCSITS)</source>
          <volume>5</volume>, no. <issue>2</issue> (<year>2015</year>):
          <fpage>271</fpage>-<lpage>274</lpage>.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name><surname>Medar</surname>, <given-names>Ramesh</given-names></string-name>,
          <string-name><given-names>Vijay S.</given-names> <surname>Rajpurohit</surname></string-name>, and
          <string-name>Shweta Shweta</string-name>.
          <article-title>"Crop yield prediction using machine learning techniques."</article-title>
          <source>In 2019 IEEE 5th International Conference for Convergence in Technology (I2CT)</source>, pp.
          <fpage>1</fpage>-<lpage>5</lpage>. IEEE, <year>2019</year>.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name><surname>Mishra</surname>, <given-names>Subhadra</given-names></string-name>,
          <string-name><given-names>Debahuti</given-names> <surname>Mishra</surname></string-name>, and
          <string-name><given-names>Gour Hari</given-names> <surname>Santra</surname></string-name>.
          <article-title>"Applications of machine learning techniques in agricultural crop production: a review paper."</article-title>
          <source>Indian Journal of Science and Technology</source>
          <volume>9</volume>, no. <issue>38</issue> (<year>2016</year>):
          <fpage>1</fpage>-<lpage>14</lpage>.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name><surname>Nigam</surname>, <given-names>Aruvansh</given-names></string-name>,
          <string-name><given-names>Saksham</given-names> <surname>Garg</surname></string-name>,
          <string-name><given-names>Archit</given-names> <surname>Agrawal</surname></string-name>, and
          <string-name><given-names>Parul</given-names> <surname>Agrawal</surname></string-name>.
          <article-title>"Crop yield prediction using machine learning algorithms."</article-title>
          <source>In 2019 Fifth International Conference on Image Information Processing (ICIIP)</source>, pp.
          <fpage>125</fpage>-<lpage>130</lpage>. IEEE, <year>2019</year>.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name><surname>Ramesh</surname>, <given-names>D.</given-names></string-name>, and
          <string-name><given-names>B. Vishnu</given-names> <surname>Vardhan</surname></string-name>.
          <article-title>"Analysis of crop yield prediction using data mining techniques."</article-title>
          <source>International Journal of Research in Engineering and Technology</source>
          <volume>4</volume>, no. <issue>1</issue> (<year>2015</year>):
          <fpage>47</fpage>-<lpage>473</lpage>.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>