=Paper=
{{Paper
|id=Vol-2030/HAICTA_2017_paper14
|storemode=property
|title=Machine Learning Based Computational Analysis Method for Cattle Lameness Prediction
|pdfUrl=https://ceur-ws.org/Vol-2030/HAICTA_2017_paper14.pdf
|volume=Vol-2030
|authors=Konstantinos Liakos,Serafeim Moustakidis,Georgia Tsiotra,Thomas Bartzanas,Dionysis Bochtis,Constantinos Parisses
|dblpUrl=https://dblp.org/rec/conf/haicta/LiakosMTBBP17
}}
==Machine Learning Based Computational Analysis Method for Cattle Lameness Prediction==
<pdf width="1500px">https://ceur-ws.org/Vol-2030/HAICTA_2017_paper14.pdf</pdf>
<pre>
     Machine Learning Based Computational Analysis
         Method for Cattle Lameness Prediction

Konstantinos Liakos1, Serafeim Moustakidis1, Georgia Tsiotra1, Thomas Bartzanas1,
                    Dionysis Bochtis1, Constantinos Parisses2
 1 Institute for Bio-economy and Agri-technology (IBO), Centre for Research &
   Technology-Hellas (CERTH), 6th km Charilaou - Thermi Rd, GR 57001 Thermi,
   Thessaloniki, Greece, e-mail: kliakos@ireteth.certh.gr
 2 Department of Electrical Engineering, Technological Education Institute
   of Western Macedonia, Kozani, Greece


       Abstract. A significant problem facing systematic cattle farming, and one
       that Precision Livestock Farming is trying to solve, is the identification
       of lameness in cattle. The aim of this research is to present a novel
       integrated computational analysis for lameness prediction based on machine
       learning methods. The new algorithm was tested on data sets of healthy and
       unhealthy cattle. It uses four features: «steps per day» (dimensionless),
       «overall walking per day» (m), «lying per day» (min) and «eating per day»
       (min), chosen to help the algorithm separate the samples as well as
       possible. The results were encouraging, since the algorithm identifies the
       positive samples (healthy cattle) and the negative samples (cattle
       suffering from lameness) equally well.

       Keywords: Lameness, Cattle, Random Forest, ANN, LIBSVM.


1 Introduction

Every year, computer science makes great progress in hardware as well as in
software, and above all in the field of machine learning. Thanks to this rapid
development of computer systems and machine learning algorithms, other scientific
domains have advanced as well. In recent years, significant studies in Precision
Livestock Farming have used machine learning methods to solve real-life problems
in an automated manner.
    Machine learning has been applied mainly to problems in Precision Agriculture.
For example, it has been used to estimate soil temperature (Nahvi, 2016), to assess
soil drying (Coopersmith, 2014) and to predict the daily dew point temperature
(Mohammadi, 2015). Machine learning has also been applied to predict wheat yield
(Pantazi, 2016) and to model evapotranspiration (Patil, et al. 2016).


   In Precision Livestock Farming, machine learning is rarely applied and mainly
concerns the automated monitoring of individual animals. Some of these studies
address behavior recognition in cattle. Specifically, (Dutta, 2015) applied machine
learning to data collected from 3-axis accelerometers and magnetometers to
distinguish when cattle search for food, graze, rest and walk. Other studies apply
machine learning to the identification of individual cattle from biometric
characteristics. For example, (Gaber, 2016) used machine learning methods to
identify cattle from biometric features of the head. Machine learning is also
applied to biological studies related to Precision Livestock Farming. For example,
(Meher, 2016) aimed to discriminate coding from non-coding regions in cattle using
two features, codon structure and methylation-mediated substitution. However, one
of the most important issues in Precision Livestock Farming is the creation of
automated systems that address the welfare and health of animals.
   Lameness is one of the most important health issues in farm animals. The
production problems caused by lameness are catastrophic for the farmer. A decrease
in profit has been reported (Bruijnis, et al. 2010), due both to lower milk and meat
production and to the increased cost of healthcare for the cattle. The diseases
associated with lameness cost 66 € per animal, with 32% of that spent on healthcare
(Bruijnis, et al. 2010). It is important to detect lameness promptly and reliably
(Booth, 2004, Holzhauer, 2004 and Tasch & Rajkondawar, 2004), in order to reduce
the cost and also to protect the health of the animal. Animals that suffer from
lameness have been observed to present various symptoms: they have difficulty
walking (Walker, 2008), lie down more than healthy animals (Walker, 2008, Ito, 2010
and Chapinal, 2009), stand less (Walker, 2008) and graze less (Miguel-Pacheco,
2014). Many studies so far have addressed the prediction of lameness in cattle.
Nevertheless, the existing methods are unclear and unreliable (Schlageter-Tello,
2014), especially those that approach lameness through computational analysis.
   The methods used to predict lameness in cattle vary from study to study. Some
studies approach lameness with optical technologies. For example, (Song, 2008)
tried to detect lameness using high-resolution pictures and videos, while (Viazzi,
2014) tried to identify it using two-dimensional and three-dimensional cameras.
Another approach is to use sensors, as reported in (Pastell, 2008), where the
authors used force sensors to record and distinguish cattle with lameness.
   The purpose of this study is to create an integrated computational analysis
based on machine learning methods that correctly distinguishes healthy cattle from
cattle that suffer from lameness.


2 Methods

   This section presents a new integrated computational analysis based on machine
learning methods. The algorithm first builds two computational models, LP1
(Table 1) and LP2 (Table 2), and then uses the model that returns the best results.
For training and prediction, the two models were tested with three machine learning
methods, namely Artificial Neural Networks, Random Forest and the Library for
Support Vector Machines (LIBSVM), to determine which combination returns the best
results and will serve as the final model. For the SVM method we used an innovative
parallel implementation, GPU-LIBSVM (Athanasopoulos, et al. 2011).
   This is the first time that GPU-LIBSVM has been used for the computational
analysis of lameness in cattle. GPU-LIBSVM is applied for the training and the
prediction of the two computational models LP1 and LP2. It allows more
computational models with many features to be created and tested 30 times faster.
   The models LP1 and LP2 differ in the number of features they use to distinguish
the samples. The purpose was to ascertain how the two computational models are
affected by their features, and which features help them distinguish healthy cattle
from cattle that suffer from lameness with greater accuracy.

Table 1. Computational Model LP1.

                         Computational Model LP1
                         Feature 1         Steps per day
                         Feature 2         Walking per day (m)
                         Feature 3         Lying per day (min)

Table 2. Computational Model LP2.

                         Computational Model LP2
                         Feature 1          Steps per day
                         Feature 2          Walking per day (m)
                         Feature 3          Lying per day (min)
                         Feature 4          Eating per day (min)

Table 3. Example from a positive and negative sample for a set with four features.

    State of cattle           Steps per day      Walking        Lying per      Eating per
                              (dimensionless)    per day (m)    day (min)      day (min)
    Healthy                   2900               3700           660            178
    With problem in hooves    600                2350           830            168
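
   The relation between the two models can be illustrated with a short sketch. The
Python below is only an illustration: the feature identifiers and the helper name
select_features are assumptions introduced here and are not taken from the authors'
code. The two sample rows are the values listed in Table 3.

```python
# Hypothetical identifiers for the four columns of Table 3.
LP1_FEATURES = ["steps_per_day", "walking_m_per_day", "lying_min_per_day"]
LP2_FEATURES = LP1_FEATURES + ["eating_min_per_day"]

# The two example samples from Table 3 (label 1 = healthy, 0 = problem in hooves).
SAMPLES = [
    {"steps_per_day": 2900, "walking_m_per_day": 3700,
     "lying_min_per_day": 660, "eating_min_per_day": 178, "label": 1},
    {"steps_per_day": 600, "walking_m_per_day": 2350,
     "lying_min_per_day": 830, "eating_min_per_day": 168, "label": 0},
]


def select_features(sample, feature_names):
    """Return the feature vector of one sample for the given model."""
    return [sample[name] for name in feature_names]


# LP1 sees three values per animal, LP2 sees the same three plus eating time.
lp1_vectors = [select_features(s, LP1_FEATURES) for s in SAMPLES]
lp2_vectors = [select_features(s, LP2_FEATURES) for s in SAMPLES]
print(lp1_vectors)
print(lp2_vectors)
```

   Keeping LP2 as LP1 plus one extra column makes any difference in results between
the two models attributable to the added «eating per day» feature.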

   For the training and the prediction of the two models, two sets are used: one
for training and one for prediction. The two sets contain few samples because they
are based on an assumption following the method of (Frondelius, 2015). As a result,
the prediction set returns unusually high values. The main goal for future work is
to create a large dataset based on the four features mentioned above and to observe
how effective they are on a large scale. The training and prediction sets are
presented in Table 4 and Table 5.

Table 4. Training set 1.1.

                               Training set 1.1
                               Positive   6 healthy cattle
                               Negative   6 cattle with lameness

Table 5. Prediction set 1.2.

                               Prediction set 1.2
                               Positive   2 healthy cattle
                               Negative   2 cattle with lameness

   The training set 1.1 and the prediction set 1.2 are used for the training and the
prediction of the computational models LP1 and LP2. The features of the two models
were scaled and then passed to the training and prediction processes of the three
machine learning methods. For LIBSVM, the best results were obtained with SVM type
One-Class and kernel type Linear.
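
   The text does not state which scaling method was used. A minimal sketch,
assuming per-feature min-max scaling to the range [-1, 1] (the default range of
LIBSVM's svm-scale tool), is given below. The function names, and the choice to
map a constant feature to the midpoint of the range, are assumptions made for
illustration.

```python
def fit_min_max(rows):
    """Return per-feature (min, max) pairs computed from the training rows."""
    columns = list(zip(*rows))
    return [(min(col), max(col)) for col in columns]


def scale_rows(rows, ranges, lower=-1.0, upper=1.0):
    """Linearly map each feature to [lower, upper] using the fitted ranges."""
    scaled = []
    for row in rows:
        out = []
        for value, (lo, hi) in zip(row, ranges):
            if hi == lo:
                # Constant feature: place it at the midpoint (assumed convention).
                out.append((lower + upper) / 2)
            else:
                out.append(lower + (upper - lower) * (value - lo) / (hi - lo))
        scaled.append(out)
    return scaled


# The two LP2 rows from Table 3, used here only as stand-in training data.
train = [[2900, 3700, 660, 178], [600, 2350, 830, 168]]
ranges = fit_min_max(train)
scaled_train = scale_rows(train, ranges)
print(scaled_train)
```

   The ranges are fitted on the training rows only and then reused for the
prediction rows, so that both sets are mapped with the same transformation.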


3 Results

   The final results of the two computational models LP1 and LP2 are presented for
the training and prediction sets, in order to observe which computational model and
which machine learning method predict lameness in cattle with the highest accuracy.
   The two computational models were implemented in the programming languages Perl,
Python and R.


3.1 LP1 Computational Model

The first table presents the threshold used for each of the three machine learning
methods and the results returned at that threshold: True Positive, False Positive,
True Negative, False Negative, Sensitivity, Specificity, Precision, Recall,
Accuracy and AUC (Area Under Curve).
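
   The listed measures follow from the four confusion counts by their standard
definitions, which the sketch below computes. The function names are assumptions.
The last entry, (sensitivity + specificity) / 2, is the area under a ROC curve that
has a single operating point; the AUC values reported in the tables appear
consistent with it, but that is an observation about the numbers and not a formula
stated in the text. The example counts are those of the LIBSVM row of Table 6.

```python
def ratio(numerator, denominator):
    """Divide, returning 0.0 when the denominator is zero (assumed convention)."""
    return numerator / denominator if denominator else 0.0


def confusion_metrics(tp, fp, tn, fn):
    """Compute the standard measures from the four confusion counts."""
    sensitivity = ratio(tp, tp + fn)   # true positive rate, same as recall
    specificity = ratio(tn, tn + fp)   # true negative rate
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": ratio(tp, tp + fp),
        "recall": sensitivity,
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        # Area under a ROC curve with one operating point (see lead-in).
        "single_threshold_auc": (sensitivity + specificity) / 2,
    }


# Counts from the LIBSVM row of Table 6: TP=5, FP=2, TN=4, FN=1.
metrics = confusion_metrics(tp=5, fp=2, tn=4, fn=1)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

   With these counts the specificity is 4/6, which the table lists as 0.66.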


 Table 6. Threshold of machine learning methods for the computational model LP1 and the
 training set 1.1.

LP1 computational model & training set 1.1
M.L. Method  Threshold  TP  FP  TN  FN  Sensitivity  Specificity  Precision  Recall  Accuracy  AUC
ANN          0.5        5   1   5   1   0.83         0.83         0.83       0.83    0.83      0.83
RF           0.5        6   0   6   0   1.00         1.00         1.00       1.00    1.00      1.00
LIBSVM       0.4        5   2   4   1   0.83         0.66         0.71       0.83    0.75      0.75


    The pie chart presents the average prediction score produced for each set by the
 three machine learning methods.


 Fig. 1. The average prediction score returned by the machine learning methods for
 the computational model LP1 and the training set 1.1.

 Table 7. Threshold of machine learning methods for the computational model LP1 and the
 prediction set 1.2.

LP1 computational model & prediction set 1.2
M.L. Method  Threshold  TP  FP  TN  FN  Sensitivity  Specificity  Precision  Recall  Accuracy  AUC
ANN          0.5        2   0   2   0   1.00         1.00         1.00       1.00    1.00      1.00
RF           0.5        2   0   2   0   1.00         1.00         1.00       1.00    1.00      1.00
LIBSVM       0.4        2   0   2   0   1.00         1.00         1.00       1.00    1.00      1.00


 Fig. 2. The average prediction score returned by the machine learning methods for
 the computational model LP1 and the prediction set 1.2.


 Fig. 3. Sensitivity vs. 1-Specificity plot of the machine learning methods for the
 computational model LP1 and for the training set 1.1 and prediction set 1.2.
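
   A Sensitivity vs. 1-Specificity plot is obtained by sweeping the decision
threshold over the prediction scores. The sketch below shows one common way to
build such points and to integrate them with the trapezoidal rule. The scores and
labels are hypothetical values invented for illustration and are not data from
this study; the function names are assumptions as well.

```python
def roc_points(scores, labels):
    """Return (1 - specificity, sensitivity) pairs, one per distinct threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def trapezoid_auc(points):
    """Area under the piecewise-linear curve through the given points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical prediction scores (NOT from the paper): 1 = healthy, 0 = lame.
scores = [0.9, 0.8, 0.7, 0.35, 0.6, 0.3, 0.2, 0.1]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

points = roc_points(scores, labels)
auc = trapezoid_auc(points)
print(points)
print(auc)
```

   In this made-up example one lame animal (score 0.6) outranks one healthy animal
(score 0.35), so 15 of the 16 healthy/lame pairs are ordered correctly.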


Fig. 4. Box plots of the 3 features from training set 1.1 and prediction set 1.2.


    From Figure 3 and Table 6, it is observed that Random Forest distinguishes the
 training set (1.1, Table 4) with the highest score, with Accuracy=100%. The
 prediction set (1.2, Table 5) is distinguished equally well by all three machine
 learning methods (Figure 3 and Table 7). Another positive aspect is that Random
 Forest separates the positive samples from the negative samples by a wide margin
 for both sets, as shown by the average prediction scores of the positive and of the
 negative samples (Figure 1 and Figure 2). Figure 4 presents the differences between
 healthy and lame cattle in steps, walking and lying. Figure 4 also reveals that, of
 the three features «steps per day» (dimensionless), «overall walking per day» (m)
 and «lying per day» (min), the most significant is «lying per day» (min), because
 it shows the largest difference in distribution between healthy and lame cattle and
 therefore helps the computational model LP1 distinguish the samples with greater
 accuracy.
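
    The comparison of average prediction scores can be sketched as follows. The
 scores are hypothetical and serve only to show the computation; the function name
 and the use of the difference of the two means as a separation measure are
 assumptions, not the authors' stated procedure.

```python
def mean_score_by_class(scores, labels):
    """Return (mean score of positive samples, mean score of negative samples)."""
    positive = [s for s, y in zip(scores, labels) if y == 1]
    negative = [s for s, y in zip(scores, labels) if y == 0]
    return sum(positive) / len(positive), sum(negative) / len(negative)


# Hypothetical prediction scores (NOT from the paper): 1 = healthy, 0 = lame.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

mean_positive, mean_negative = mean_score_by_class(scores, labels)
separation = mean_positive - mean_negative  # larger gap = clearer separation
print(round(mean_positive, 2), round(mean_negative, 2), round(separation, 2))
```

    A larger gap between the two means indicates that a method assigns clearly
 different scores to healthy and lame animals, which is the property read off the
 average prediction score figures.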


 3.2 LP2 Computational Model

 This section presents the results for the second computational model (LP2, Table 2)
 on the training set (1.1, Table 4) and the prediction set (1.2, Table 5).

 Table 8. Threshold of the machine learning methods for the computational model LP2 and the
 training set 1.1.

LP2 computational model & training set 1.1
M.L. Method  Threshold  TP  FP  TN  FN  Sensitivity  Specificity  Precision  Recall  Accuracy  AUC
ANN          0.5        6   0   6   0   1.00         1.00         1.00       1.00    1.00      1.00
RF           0.5        6   0   6   0   1.00         1.00         1.00       1.00    1.00      1.00
LIBSVM       0.5        6   0   6   0   1.00         1.00         1.00       1.00    1.00      1.00


 Fig. 5. The average prediction score returned by the machine learning methods for
 the computational model LP2 and the training set 1.1.

 Table 9. Threshold of the machine learning methods for the computational model LP2 and the
 prediction set 1.2.

LP2 computational model & prediction set 1.2
M.L. Method  Threshold  TP  FP  TN  FN  Sensitivity  Specificity  Precision  Recall  Accuracy  AUC
ANN          0.5        2   0   2   0   1.00         1.00         1.00       1.00    1.00      1.00
RF           0.5        2   0   2   0   1.00         1.00         1.00       1.00    1.00      1.00
LIBSVM       0.5        2   0   2   0   1.00         1.00         1.00       1.00    1.00      1.00


Fig. 6. The average prediction score returned by the machine learning methods for
the computational model LP2 and the prediction set 1.2.


Fig. 7. Sensitivity vs. 1-Specificity plot of the machine learning methods for the
computational model LP2 and for the training set 1.1 and prediction set 1.2.


Fig. 8. Box plots of the 4 features from training set 1.1 and prediction set 1.2.

   The results listed in Table 8 and Table 9 and depicted in Figure 7 are
encouraging, because the three machine learning methods distinguish the training
set (1.1, Table 4) and the prediction set (1.2, Table 5) equally well for the
computational model (LP2, Table 2). The most significant conclusion drawn from
these results and from Figure 8 is that the fourth feature, «eating per day» (min),
is crucial and helps all three machine learning methods distinguish the positive
from the negative samples more accurately. The second most crucial feature is
«lying per day» (min). A second positive result is the increased separation of
prediction scores between positive and negative samples across the three machine
learning methods (Figure 5 and Figure 6) for both sets, mainly for ANN and LIBSVM,
which is attributed to the fourth feature.
   The conclusion drawn from the tables and figures for the two computational
models (LP1, Table 1) and (LP2, Table 2) is that the three machine learning
methods, Artificial Neural Networks, Random Forest and Library for Support Vector
Machines, distinguish the positive from the negative samples with remarkable
results. The features used by the two computational models are therefore crucial,
mainly «eating per day» (min) and «lying per day» (min), and they enable the
algorithm to distinguish the positive from the negative samples with greater
accuracy.


   The machine learning method that returned the best results for the two
computational models was Random Forest.
   Of the two computational models compared, the best results were returned by the
(LP2, Table 2) model. The reason is that the model uses the fourth feature, «eating
per day» (min), to distinguish the positive from the negative samples. The fourth
feature significantly helps the three machine learning methods separate healthy
from non-healthy samples, as shown by the wide separation of the prediction scores
between the positive and negative samples.


4 Conclusion

   The result of this study was the development of a new integrated, powerful and
reliable computational analysis for identifying lameness in cattle based on machine
learning. Two computational models, Lameness Potential 1 (LP1) and Lameness
Potential 2 (LP2), were created. The model that excelled was LP2, which uses four
features, «steps per day» (dimensionless), «overall walking per day» (m), «lying
per day» (min) and «eating per day» (min), to distinguish the positive from the
negative samples. These four features were chosen to help the algorithm distinguish
the samples as well as possible. The final result is encouraging, because the
algorithm distinguishes the positive (healthy) and negative (lame) samples equally
well, as indicated by the wide separation of the prediction scores between the
positive and negative samples. As a result, the algorithm is able to distinguish
healthy cattle from cattle with lameness with high accuracy.


References

1. Athanasopoulos, A. and Dimou, A. (2011) ‘GPU acceleration for support vector
   machines’, WIAMIS 2011: 12th …, (April). Available at:
   http://repository.tudelft.nl/view/conferencepapers/uuid:6716875f-5b40-4e7b-9f9d-24a85c02ee3b/.
2. Booth, C. J., Warnick, L. D., Grohn, Y. T., Maizon, D. O., Guard, C. L. and
   Janssen, D. (2004) ‘Effect of lameness on culling in dairy cows’, Journal of
   Dairy Science, 87(12), pp. 4115–4122. doi: 10.3168/jds.S0022-0302(04)73554-7.
3. Bruijnis, M. R., Hogeveen, H. and Stassen, E. N. (2010) ‘Assessing economic
   consequences of foot disorders in dairy cattle using a dynamic stochastic
   simulation model’, Journal of Dairy Science, 93(6), pp. 2419–2432.
4. Chapinal, N., de Passillé, A. M., Weary, D. M., von Keyserlingk, M. A. G. and
   Rushen, J. (2009) ‘Using gait score, walking speed, and lying behavior to detect
   hoof lesions in dairy cows’, Journal of Dairy Science, 92(9), pp. 4365–4374. doi:
   10.3168/jds.2009-2115.


5. Coopersmith, E. J., Minsker, B. S., Wenzel, C. E. and Gilmore, B. J. (2014)
    ‘Machine learning assessments of soil drying for agricultural planning’,
    Computers and Electronics in Agriculture. Elsevier B.V., 104, pp. 93–104. doi:
    10.1016/j.compag.2014.04.004.
6. Dutta, R., Smith, D., Rawnsley, R., Bishop-Hurley, G., Hills, J., Timms, G. and
    Henry, D. (2015) ‘Dynamic cattle behavioural classification using supervised
    ensemble classifiers’, Computers and Electronics in Agriculture. Elsevier B.V.,
    111, pp. 18–28. doi: 10.1016/j.compag.2014.12.002.
7. Gaber, T., Tharwat, A., Hassanien, A. E. and Snasel, V. (2016) ‘Biometric cattle
    identification approach based on Weber’s Local Descriptor and AdaBoost
    classifier’, Computers and Electronics in Agriculture. Elsevier B.V., 122, pp. 55–
    66. doi: 10.1016/j.compag.2015.12.022.
8. Holzhauer, M., Middelesch, H., Bartels, C. and Frankena, K. (2004) ‘Evaluation
    of a Dutch claw health scoring system in dairy cattle’, in Proceedings of the 13th
    International Symposium and 5th Conference on Lameness in Ruminants.
    Available at:
    http://search.ebscohost.com/login.aspx?direct=true&db=lah&AN=20043084821&site=ehost-live
    (email: m.holzhauer@gdvdieren.nl).
9. Ito, K., von Keyserlingk, M. A. G., Leblanc, S. J. and Weary, D. M. (2010) ‘Lying
    behavior as an indicator of lameness in dairy cows’, Journal of Dairy Science,
    93(8), pp. 3553–3560. doi: 10.3168/jds.2009-2951.
10. Meher, P. K., Sahu, T. K., Rao, A. R. and Wahi, S. D. (2016) ‘Discriminating
    coding from non-coding regions based on codon structure and methylation-
    mediated substitution: An application in rice and cattle’, Computers and
    Electronics in Agriculture. Elsevier B.V., 129, pp. 66–73. doi:
    10.1016/j.compag.2016.09.013.
11. Miguel-Pacheco, G. G., Kaler, J., Remnant, J., Cheyne, L., Abbott, C., French, A.
    P., Pridmore, T. P. and Huxley, J. N. (2014) ‘Behavioural changes in dairy cows
    with lameness in an automatic milking system’, Applied Animal Behaviour
    Science, 150, pp. 1–8. doi: 10.1016/j.applanim.2013.11.003.
12. Mohammadi, K., Shamshirband, S., Motamedi, S., Petković, D., Hashim, R. and
    Gocic, M. (2015) ‘Extreme learning machine based prediction of daily dew point
    temperature’, Computers and Electronics in Agriculture, 117, pp. 214–225. doi:
    10.1016/j.compag.2015.08.008.
13. Nahvi, B., Habibi, J., Mohammadi, K., Shamshirband, S. and Al Razgan, O. S.
    (2016) ‘Using self-adaptive evolutionary algorithm to improve the performance
    of an extreme learning machine for estimating soil temperature’, Computers and
    Electronics in Agriculture. Elsevier B.V., 124, pp. 150–160. doi:
    10.1016/j.compag.2016.03.025.
14. Pantazi, X. E., Moshou, D., Alexandridis, T., Whetton, R. L. and Mouazen, A. M.
    (2016) ‘Wheat yield prediction using machine learning and advanced sensing
    techniques’, Computers and Electronics in Agriculture. Elsevier B.V., 121, pp.
    57–65. doi: 10.1016/j.compag.2015.11.018.


15. Pastell, M., Kujala, M., Aisla, A. M., Hautala, M., Poikalainen, V., Praks, J.,
    Veermäe, I. and Ahokas, J. (2008) ‘Detecting cow’s lameness using force
    sensors’, Computers and Electronics in Agriculture, 64(1), pp. 34–38. doi:
    10.1016/j.compag.2008.05.007.
16. Patil, A. P. and Deka, P. C. (2016) ‘An extreme learning machine approach for
    modeling evapotranspiration using extrinsic inputs’, Computers and Electronics
    in Agriculture. Elsevier B.V., 121, pp. 385–392. doi:
    10.1016/j.compag.2016.01.016.
17. Schlageter-Tello, A., Bokkers, E. A. M., Koerkamp, P. W. G. G., Van Hertem,
    T., Viazzi, S., Romanini, C. E. B., Halachmi, I., Bahr, C., Berckmans, D. and
    Lokhorst, K. (2014) ‘Manual and automatic locomotion scoring systems in dairy
    cows: A review’, Preventive Veterinary Medicine, pp. 12–25. doi:
    10.1016/j.prevetmed.2014.06.006.
18. Song, X., Leroy, T., Vranken, E., Maertens, W., Sonck, B. and Berckmans, D.
    (2008) ‘Automatic detection of lameness in dairy cattle-Vision-based trackway
    analysis in cow’s locomotion’, Computers and Electronics in Agriculture, 64(1),
    pp. 39–44. doi: 10.1016/j.compag.2008.05.016.
19. Tasch, U. and Rajkondawar, P. G. (2004) ‘The development of a SoftSeparator™
    for a lameness diagnostic system’, Computers and Electronics in Agriculture,
    44(3), pp. 239–245. doi: 10.1016/j.compag.2004.04.001.
20. Viazzi, S., Bahr, C., Van Hertem, T., Schlageter-Tello, A., Romanini, C. E. B.,
    Halachmi, I., Lokhorst, C. and Berckmans, D. (2014) ‘Comparison of a three-
    dimensional and two-dimensional camera system for automated measurement of
    back posture in dairy cows’, Computers and Electronics in Agriculture. Elsevier
    B.V., 100, pp. 139–147. doi: 10.1016/j.compag.2013.11.005.
21. Walker, S. L., Smith, R. F., Routly, J. E., Jones, D. N., Morris, M. J. and Dobson,
    H. (2008) ‘Lameness, Activity Time-Budgets, and Estrus Expression in Dairy
    Cattle’, Journal of Dairy Science, 91(12), pp. 4552–4559. doi: 10.3168/jds.2008-
    1048.



</pre>