=Paper=
{{Paper
|id=Vol-3126/paper53
|storemode=property
|title=Evaluation and comparison of the processes in the frozen vegetable production using machine learning methods
|pdfUrl=https://ceur-ws.org/Vol-3126/paper53.pdf
|volume=Vol-3126
|authors=Piotr Milczarski
}}
==Evaluation and comparison of the processes in the frozen vegetable production using machine learning methods==
Evaluation and Comparison of the Processes in the Frozen
Vegetable Production Using Machine Learning Methods
Piotr Milczarski
Faculty of Physics and Applied Informatics, University of Lodz, Pomorska str. 149/153, Lodz, Poland
Abstract
This paper presents a study of carbon footprint (CF) assessment in frozen vegetable production processes, with the aim of obtaining low-carbon products. Three clusterization methods have been chosen for the production assessment. The clusterization results are evaluated by five classification methods: k-Nearest Neighbors, Multilayer Perceptron, C4.5, Random Forest and Support Vector Machines with a radial basis kernel function. In the chosen model with five clusters, the best clusterization methods are k-means followed by Canopy.
Keywords
Carbon Footprint; clusterization; Canopy; k-means; Expectation-Maximization; k-Nearest Neighbors; Multilayer Perceptron; C4.5; Random Forest; Support Vector Machines
1. Introduction
Greenhouse gas emissions from human activities have been a major contributor to global warming since the mid-twentieth century. Agriculture and land-use change contributed to 17% of global anthropogenic greenhouse gas emissions in 2010 [1]. By 2050 the population will reach 9 billion people [2]; to ensure the food supply, agricultural production should be increased by 60%. Climate change can affect food availability; for example, an increase in temperature, a change in the structure of rainfall or extreme weather events may result in a reduction in agricultural productivity [3, 4]. Therefore, the main challenge has become to mitigate the threats that climate change poses to food security.
In response to the emerging threats of climate change, numerous programs, both global and regional, have been developed, the purpose of which is to slow down the growth rate of GHG concentration [5]. Achieving climate policy goals requires continuous monitoring of emissions and verification of the effectiveness of solutions for the development of a low-emission economy.
The action plan for the reduction of gaseous emissions adopted by EU countries in 2014 requires the reduction of GHG emissions by 30% by 2030, compared to the level in 2005 [6]. The methods of calculating the carbon footprint are most often based on well-known standards. Among them, the most used are:
- ISO 14040:2006 [7] – Environmental management - Life cycle assessment: principles and framework,
- ISO 14064-1:2018 [8] – Greenhouse gases - Part 1: Specification with guidance at the organization level for quantification and reporting of greenhouse gas emissions and removals,
- ISO/TS 14067:2018 [9] – Greenhouse gases - Carbon footprint of products - Requirements and guidelines for quantification,
- PAS 2050 [10] – Specification for the assessment of the life cycle greenhouse gas emissions of goods and services.
Once the carbon footprint has been calculated, its detailed data helps to identify weaknesses, i.e. high-emission areas, that can be eliminated or improved. Thus, the carbon footprint is an indicator of sustainable development.
ISIT 2021: II International Scientific and Practical Conference
«Intellectual Systems and Information Technologies», September
13–19, 2021, Odesa, Ukraine
EMAIL: piotr.milczarski@uni.lodz.pl (A. 1);
ORCID: 0000-0002-0095-6796 (A. 1);
©️ 2021 Copyright for this paper by its authors. Use permitted under Creative
Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org)
2. Carbon footprint assessment using Life Cycle Assessment (LCA) method
Carbon footprint calculation is used as a tool for assessing greenhouse gas emissions, helping to manage and reduce them. The carbon footprint is typically calculated using carbon emission factors and activity data that can be assessed through a Life Cycle Assessment (LCA). The carbon footprint analysis according to the LCA methodology is carried out by identifying potential environmental threats, usually throughout the entire life cycle of a product, i.e. from the extraction and processing of raw materials, their transport, through main production, distribution and use, to waste management [11]. However, in agricultural production, the emissions directly related to energy consumption are not dominant [12]. A large part of GHG emissions on farms comes from gas losses from farmland and livestock. When the carbon footprint is calculated with the use of agricultural emission models according to the IPCC reports, all emission sources are taken into account, both those related to energy carriers and those related to processes taking place in the agricultural environment.
LCA is a widely used approach to assess the actual environmental impact of a product from its production and use [11] [12] [13]. The standards for assessing the product carbon footprint in LCA are mainly PAS 2050 [10] and ISO/TS 14067 [9].
In the case of the CFOOD project, which is presented in the paper, the focus is on the optimization of the frozen food production process, so we consider a segment of the product life cycle from the moment of raw material delivery to the shipment of the finished frozen food to the recipient.
According to the adopted LCA methodology, the carbon footprint of a product consists of the carbon footprints generated at the successive stages of its life cycle. Hence the total CF for a given product or its unit value can be expressed by the following formula [14][15][16]:
$$CF = \sum_{i=a}^{r} CF_i \qquad (1)$$
where i denotes each of the stages of the product life cycle; i = a, m, t, u and r relate to the extraction of raw materials, production, transport, use, and the recycling and disposal stage, respectively.

3. Carbon footprint assessment in CFOOD project
In the case of the CFOOD project, we focus on the optimization of the frozen food production process, so we consider a segment of the product life cycle from the moment of raw material delivery to the shipment of the finished frozen food to the recipient. The production process can be divided into several smaller stages:
- S1 – initial cooling of the raw materials before the processing;
- S2 – the raw material preparation for the production;
- S3 – raw material pre-processing on the production line;
- S4 – product freezing in the cold tunnel;
- S5 – product preparation for the cold store.
Each of the process stages is connected to electric meter units. Each production stage also has a preparation phase that is measured separately, e.g. S1 has a preparation phase that is denoted pS1, etc.
In the research section, we have tested several clusterization methods and chosen three: Canopy, k-Means (KM) and Expectation-Maximization (EM) [17][18]. We have tested several options for the number of clusters and chosen five clusters for each method, which according to our experience should represent some real situations that occur during production and in the accounting systems:
- Optimal production – the product has a temperature from -25°C to -18°C at the end of the line;
- Close to optimal – during the high season the throughput should be higher, hence the energy consumption should be lower; the product temperature is allowed to be in the range from -6°C to -18°C.
- Wrong accounting of some parameters, e.g. operator mistakes resulting in too high or too low values of, e.g., the throughput.
- Malfunction of the energy meters. It is a different situation from the above one and might result in random results.
The clusterization model with five clusters should have at least 60 processes. After a year of process measurement, till June 2021, we have collected 152 results for the frozen onion production and 75 for the spinach. The other vegetables have fewer than 50 cases each. Nonetheless,
the production of other vegetables, e.g. broccoli and cauliflower, should also be optimized. That is why the current paper presents the clusterization results of 35 broccoli processes and 42 cauliflower processes.
In the previous work [15][16], to assess the onion and spinach production processes, we prepared a set of verified data, and to assess the trustworthiness of the production data we compared the results of process classification using five classifiers: k-Nearest Neighbors, Multilayer Perceptron [17], C4.5, Random Forest and Support Vector Machines with a radial basis kernel function [17]. In the current paper, the focus is on unsupervised methods, i.e. the clusterization [17] of the broccoli and cauliflower processes.

Table 1
K-means clusterization of broccoli production; the units for the stages pS1, S1, etc. are kWh/ton, for pt ton/h, and for et kWh/h
Broccoli clusters (K-Means)
Attribute        0       1       2       3       4
pS1           0.08    0.32    0.04    4.19    0.09
S1            1.34    1.35    1.51    4.25    2.08
S2            0.16    0.03    0.23    0.09    0.08
pS3           0.06    0.05    0.03    0.11    0.06
S3            0.91    1.14    0.70    0.21    1.38
pS4           7.68    2.29    0.12    6.54    0.25
S4           49.10   55.69    3.07   13.19    6.40
pS5           0.01    0.18    0.00    0.18    0.01
S5            0.18    1.51    0.03    0.24    0.17
pt            1.56    1.46    1.80    2.11    2.12
et           98.67   91.01    9.91   57.77   20.32
instances        4       4       3      22       2

Tables 1-3 and 4-6 show the clusterization results of the broccoli and cauliflower production processes, respectively. The units for the stages pS1, S1, etc. are kWh/ton, for pt ton/h, and for et kWh/h. The results were obtained using the chosen clusterization methods with five clusters:
- Canopy: max-candidates = 100; periodic-pruning = 10000; min-density = 2.0; T2 radius = 0.804 and T1 radius = 1.005.
- k-Means (KM) with Euclidean distance, max-candidates = 100, periodic-pruning = 10000, min-density = 2.0, T1 = -1.25 and T2 = -1.0.
- Expectation–Maximization (EM) with max-candidates = 100, “minimum improvement in log likelihood” = 1E-5, “minimum improvement in cross-validated log likelihood” = 1E-6, and “minimum allowable standard deviation” = 1E-6.

Table 2
Canopy clusterization of broccoli production
Broccoli clusters (Canopy)
Attribute        0       1       2       3       4
pS1           0.09    0.39    0.08    0.13    0.13
S1            2.85    1.53    0.13    6.92    0.71
S2            0.11    0.03    0.10    0.11    0.05
pS3           0.02    0.06    0.05    0.00    0.07
S3            0.44    1.25    0.63    0.14    0.63
pS4           1.59    1.75    5.22    0.14    5.36
S4           16.85   58.77    45.3   10.65   43.53
pS5           0.01    0.24    0.00    0.00    0.22
S5            0.21    1.74    0.00    0.21    0.42
pt            2.00    1.35    1.55    1.90    1.92
et           42.19   85.69    82.9   33.65   100.1
instances       16       3       3       8       5

Table 3
EM clusterization of broccoli production
Broccoli clusters (EM)
Attribute        0       1       2       3       4
pS1           0.09    0.33    0.02   89.74    0.25
S1            3.17   13.28    1.16    6.92    1.46
S2            0.08    0.11    0.23    0.14    0.06
pS3           0.01    0.02    0.04    2.16    0.06
S3            0.27    0.55    0.77    0.14    1.01
pS4           0.30    1.86    4.55   129.4    3.27
S4            8.60   38.08   20.92   11.29   52.48
pS5           0.01    0.05    0.00    3.61    0.14
S5            0.18    0.68    0.02    0.27    1.02
pt            2.13    2.07    1.71    1.96    1.55
et           26.84   104.9   44.61   465.0   95.07
instances       19       2       5       1       8

Figures 1 and 2 show the energy consumption recorded during production by the energy meters of the stages S1, S2, S3 and S4 for the chosen broccoli process with ID 373 and the cauliflower process with ID 365.
Figure 1: Example of energy consumption for the broccoli production, process ID 373; the colors of the stages: S1 – brown, S2 – green, S3 – light blue, S4 – dark blue.
Figure 2: Example of energy consumption for the cauliflower production, process ID 365; the colors of the stages: S1 – brown, S2 – green, S3 – light blue, S4 – dark blue.
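The k-Means method with Euclidean distance named above can be illustrated with a minimal sketch of the standard Lloyd iteration. This is not the implementation or the parameter set used in the study, only a pure-Python outline of the technique. The feature pairs in `processes` are hypothetical values chosen for illustration and are not measured data. The helper names `euclidean` and `k_means` are likewise assumptions.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equally long feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def k_means(points, k, max_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    rng = random.Random(seed)
    # Start from k distinct input points chosen at random.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: euclidean(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # assignments are stable: converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical (S4 energy in kWh/ton, pt in ton/h) pairs, NOT measured data.
processes = [(8.0, 2.1), (9.0, 2.0), (8.5, 2.2),
             (50.0, 1.5), (52.0, 1.4), (49.0, 1.6)]
labels, centroids = k_means(processes, k=2)
print(labels)
```

In this sketch the features are used unscaled, so the attribute with the largest numeric range (here the S4 energy) dominates the distance. Whether and how attributes are normalized before clustering is a design choice that the sketch does not settle.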
Table 4
K-means clusterization of cauliflower production
Cauliflower clusters (K-Means)
Attribute        0       1       2       3       4
pS1           0.52    0.18    5.46    6.97   519.2
S1           24.27    2.48    7.08    1.00    2.28
S2            1.13    0.10    0.14    0.06    0.05
pS3           0.17    0.06    0.16    3.20   157.7
S3            8.41    0.97    1.71    0.55    1.21
pS4           0.43    5.22    3.67   22.58   678.1
S4           28.30   57.14   17.50    3.14    5.55
pS5           0.02    0.22    0.14    0.84   48.59
S5            0.69    1.31    0.33    0.06    0.24
pt            1.86    1.37    2.07    1.64    2.22
et           127.0   92.66   79.17   81.15    3332
instances        3       5      17      15       2

Table 5
Canopy clusterization of cauliflower production
Cauliflower clusters (Canopy)
Attribute        0       1       2       3       4
pS1           5.23    0.50   519.2    0.70    0.10
S1            4.52   24.42    2.28   14.62    7.16
S2            0.11    1.60    0.05    0.35    0.08
pS3           1.35    0.09   157.7    0.01    0.01
S3            1.34    8.24    1.21    0.77    2.72
pS4          11.26    0.36   678.1    0.11    0.18
S4           17.43   26.35    5.55    4.30   11.93
pS5           0.42    0.01   48.59    0.00    0.01
S5            0.37    0.55    0.24    0.13    0.58
pt            1.80    1.87    2.22    1.67    1.81
et           83.16   123.6    3332   36.75   44.63
instances       27       2       2       3       8
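In both Table 4 and Table 5, one cluster of two processes has et = 3332 kWh/h, far above the remaining clusters. The sketch below shows one possible way to screen cluster summaries for such extreme values. The et values are copied from Tables 4 and 5. The function name `suspect_clusters` and the factor of 10 over the median are assumptions made for illustration, not part of the study, and the paper does not state which production situation this cluster represents.

```python
from statistics import median

# Mean hourly energy use "et" (kWh/h) per cluster, copied from Tables 4 and 5.
ET_KMEANS = [127.0, 92.66, 79.17, 81.15, 3332.0]
ET_CANOPY = [83.16, 123.6, 3332.0, 36.75, 44.63]


def suspect_clusters(et_values, factor=10.0):
    """Return indices of clusters whose et exceeds factor * median(et).

    The factor is an arbitrary illustrative threshold (assumption).
    """
    limit = factor * median(et_values)
    return [i for i, et in enumerate(et_values) if et > limit]


print(suspect_clusters(ET_KMEANS))  # index of the flagged K-Means cluster
print(suspect_clusters(ET_CANOPY))  # index of the flagged Canopy cluster
```

The median is used instead of the mean because a single extreme cluster would pull the mean upwards and could hide itself.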
4. Evaluation of the clusterization
In Tables 1-6, the optimal clusters have been highlighted. All values for the stages and their preparation phases are in kWh/ton, and the production throughput (pt) is in ton/h. K-means and EM seem to provide the best assessment of the processes, because their optimal clusters have the lowest energy consumption among the three optimal clusters, one for each clusterization.

Table 6
EM clusterization of cauliflower production
Cauliflower clusters (EM)
Attribute        0       1       2       3       4
pS1           3.44    0.50    0.17   34.90   519.2
S1            4.13   23.95    2.13    0.06    2.28
S2            0.10    0.94    0.10    0.00    0.05
pS3           0.11    0.13    0.08   16.03   157.7
S3            1.31    6.59    0.96    0.00    1.21
pS4           2.13    0.34    5.53   113.2   678.1
S4           11.01   22.59    54.4    0.28    5.55
pS5           0.09    0.01    0.19    4.24   48.59
S5            0.23    0.58    1.11    0.01    0.24
pt            1.89    1.94    1.47    1.55    2.22
et            48.6   112.4    94.3   363.0    3332
instances       27       4       6       3       2

To assess and choose the clusterization method, we have used five machine learning methods, as in our previous work [15][16]. All the clusterization results were assessed by the classification methods with the same parameters. Tables 7 and 8 show the classification results of the production processes using the following classifiers:
- 3NN (kNN): 3-Nearest Neighbors;
- Multilayer Perceptron (MLP) with one hidden layer of 16 nodes for both productions, a learning rate equal to 0.79 and momentum equal to 0.39 [13];
- binary tree C4.5 with a confidence factor equal to 0.25 and a minimum number of instances per leaf equal to 2;
- Random Forest (RF) with the bag size percent equal to 100, unlimited maximum depth, number of execution slots equal to 1 and 100 iterations;
- Support Vector Machine (SVM) with a radial basis function (RBF) kernel given by Eq. (2):
$$K(x, y) = \exp\left(-0.05\,\lVert x - y \rVert^{2}\right) \qquad (2)$$

Table 7
Evaluation of the broccoli clusterization by the chosen classifiers
Broccoli evaluation results [%]
Classifier  Canopy      KM      EM
3NN           85.7    97.1    97.1
C4.5          94.3     100    97.1
MLP           97.1    94.3    97.1
RF             100     100     100
SVM            100     100     100

Table 8
Evaluation of the cauliflower clusterization by the chosen classifiers
Cauliflower evaluation results [%]
Classifier  Canopy      KM      EM
3NN           90.5    90.5    85.7
C4.5          95.2    97.6    97.6
MLP           92.9    81.0    92.9
RF             100     100     100
SVM            100     100     100

5. Conclusions
In the paper, three clusterization methods have been shown that allow us to assess the processes and their impact on energy consumption and, hence, the carbon footprint. We have shown that all the clustering methods point out the processes that are proper from the manufacturing point of view. The results for the broccoli and cauliflower production, taking into account 35 and 42 processes respectively, have been shown. Currently, we are collecting new processes for the other vegetable products. They will be analyzed using the clustering methods shown above.
The k-means method is fast and simple, but it has a significant disadvantage: it is sensitive to outliers, which distort the mean value. Although, together with EM, it gives the best results in the assessment of the whole production, we plan to use k-SVD and fuzzy k-means methods in future work.

6. Acknowledgements
The paper is co-financed by the Polish National Center for Research and Development,
grant CFOOD number BIOSTRATEG3/343817/17/NCBR/2018.

7. References
[1] O. Edenhofer, R. Pichs-Madruga, Y. Sokona, E. Farahani, S. Kadner, K. Kadner, A. Seyboth, I. Adler, S. Baum, G. Myhre, et al., “Climate Change 2014: Mitigation of Climate Change”, Working Group III Contribution to the IPCC Fifth Assessment Report, Cambridge University Press: Cambridge, UK, 2015.
[2] Food and Agriculture Organization of the United Nations (FAO), Regional Strategy for Sustainable Hybrid Rice Development in Asia, Food and Agriculture Organization of the United Nations Regional Office for Asia and the Pacific: Bangkok, Thailand, 2014.
[3] D.B. Lobell, W. Schlenker, J. Costa-Roberts, “Climate trends and global crop production since 1980”, Science 2011, 333, 616–620.
[4] R.Y.M. Kangalawe, C.G. Mungongo, A.G. Mwakaje, E. Kalumanga, P.Z. Yanda, “Climate change and variability impacts on agricultural production and livelihood systems in Western Tanzania”, Clim. Dev. 2017, 9, 202–216.
[5] ECE, Strategies and policies for air pollution abatement. United Nations, New York and Geneva, 2007.
[6] European Council Conclusions 2014. 2030 Climate and energy policy framework. Conclusions – 23/24 October 2014, EUCO 169/14, http://www.consilium.europa.eu/uedocs/cms_data/docs/pressdata/en/ec/145397.pdf
[7] ISO 14040 - Environmental management - Life cycle assessment: principles and framework. International Organization for Standardization, Geneva, 2006.
[8] ISO 14064-1 - Greenhouse gases - Part 1: Specification with guidance at the organization level for quantification and reporting of greenhouse gas emissions and removals. International Organization for Standardization, Geneva, 2018.
[9] ISO/TS 14067 - Greenhouse gases - Carbon footprint of products - Requirements and guidelines for quantification. International Organization for Standardization, Geneva, 2018.
[10] PAS 2050 (2011) “The Guide to PAS2050-2011, Specification for the assessment of the life cycle greenhouse gas emissions of goods and services”. British Standards Institution, 2011.
[11] M.A. Renouf, C. Renaud-Gentie, A. Perrin, C. Kanyarushoki, F. Jourjon, “Effectiveness criteria for customised agricultural life cycle assessment tools”, J. Clean. Prod. 179, 2018, 246–254.
[12] D. Perez-Neira, A. Grollmus-Venegas, “Life-cycle energy assessment and carbon footprint of peri-urban horticulture. A comparative case study of local food systems in Spain”, Landscape and Urban Planning 172, 2018, 60–68.
[13] A. Nabavi-Pelesaraei, S. Rafiee, S.S. Mohtasebi, H. Hosseinzadeh-Bandbafha, K. Chau, “Energy consumption enhancement and environmental life cycle assessment in paddy production using optimization techniques”, J. Clean. Prod. 162, 2017, 571–586.
[14] P. Milczarski, A. Hłobaż, P. Maślanka, B. Zieliński, Z. Stawska, P. Kosiński, “Carbon footprint calculation and optimization approach for CFOOD project”, CEUR Workshop Proceedings 2683 (2019) 30–34.
[15] P. Milczarski, B. Zieliński, Z. Stawska, A. Hłobaż, P. Maślanka, P. Kosiński, “Machine Learning Application in Energy Consumption Calculation and Assessment in Food Processing Industry”, ICAISC (2) (2020), Springer LNAI 12416, 369–379.
[16] Z. Stawska, P. Milczarski, et al., “The carbon footprint methodology in CFOOD project”, International Journal of Electronics and Telecommunications, 2020, 66(4), 781–786.
[17] P. Harrington, “Machine Learning in Action”, Manning Publ., 2012.
[18] A.P. Dempster, N.M. Laird, D.B. Rubin, “Maximum Likelihood from Incomplete Data via the EM Algorithm”, Journal of the Royal Statistical Society, Series B, 39 (1), 1977, 1–38.