<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A Comparative Analysis of AI Models for Transit Time Prediction in Transportation Systems</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Tuan Vu</string-name>
          <email>tuan.vu@tuni.fi</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ali Jedari Heidarzadeh</string-name>
          <email>ali.jedariheidarzadeh@tuni.fi</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Seyedamir Ahmadi</string-name>
          <email>seyedamir.ahmadi@tuni.fi</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Wael M. Mohammed</string-name>
          <email>wael.mohammed@tuni.fi</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mika Tuomola</string-name>
          <email>mika.tuomola@honkajokioy.fi</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jose Luis Martinez Lastra</string-name>
          <email>jose.martinezlastra@tuni.fi</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>FAST-Lab, Faculty of Engineering and Natural Sciences, Tampere University</institution>
          ,
          <addr-line>Tampere</addr-line>
          ,
          <country country="FI">Finland</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>GMM Finland Oy</institution>
          ,
          <addr-line>Santastentie 197, 38950 Honkajoki</addr-line>
          ,
          <country country="FI">Finland</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The term time-critical emphasizes the dominance of time as a factor in systems, operations, processes, and activities. For example, processing animal by-products is time-critical because the material degrades rapidly and can become harmful, making it unsuitable as raw material for value-added products. In this industry, time estimation and prediction widen the margin for decision-making in logistics and processing. This paper presents a comparative review of AI algorithms that predict the readiness of by-product containers at slaughterhouses. The prediction allows the logistics planner to schedule logistics resources earlier than usual, so that delays in logistics can be reduced or even eliminated. The models were trained on 10 months of real data collected from a processing facility. Among several AI algorithms, the Decision Tree and Extra Trees regressors yielded the lowest error, and a voting regressor combining these two models gave better results and higher stability.</p>
      </abstract>
      <kwd-group>
        <kwd>Artificial Intelligence</kwd>
        <kwd>Transit Time Prediction</kwd>
        <kwd>Data Analytics</kwd>
        <kwd>Feature Engineering</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Organic by-products from animal food production are usually not fit for human consumption.
Nonetheless, they are used as raw materials to produce a wide range of commodities such as animal
food, fertilizers, and biofuels, which in turn increases the sustainability of the entire food chain and
reduces its environmental impact [1].</p>
      <p>According to Regulations (EC) No 1069/2009 [2] and (EU) No 142/2011 [3], the quality and category of
animal by-products depend on two main factors: the contents of the by-products and their age.
These two factors affect the types and quality of the commodities produced [4].
Therefore, it is necessary to optimize logistics activities to maximize the quality of the
by-products. One of the main challenges in this optimization problem is the narrow time window
before the material starts decomposing. This time-critical nature tightens the constraints, which in
turn reduces the margin around the optimal solution.</p>
      <p>With a substantial need for solutions that improve environmental impact, the EU
Commission is funding several research projects. One of these is Optimizing Production
and Logistic Resources in the Time-critical Bio Production Industries in Europe (CLARUS) [5],
which intends to develop AI solutions for improving and
sustaining the food industry. To validate the project goals, Honkajoki Oy, the leading animal
by-product processing company in Finland, has been chosen for a use case involving logistics
optimization. One scenario of this use case involves optimizing the selection of time-critical
containers with the highest quality (i.e., category three material [3]) of animal by-products for
transportation from slaughterhouses to Honkajoki’s processing facilities. In this regard, this paper
presents the performance results of several machine-learning models trained on historical logistics
data. The main objective of this research is to provide an empirical comparison of AI algorithms that
provide accurate predictions of the availability of the material. Such predictions may help end
users react early, which in turn widens the narrow time window.</p>
      <p>The rest of this paper is structured as follows. This introduction provides the context of the
paper and the CLARUS project. Section 2 gives insights into by-product logistics optimization and
the goal of forecasting the transit time. Section 3 explains the approach developed to tackle the
issue, and Section 4 presents the preliminary results. Lastly, Section 5 provides
concluding remarks and potential next research steps.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Review of animal by-product logistics</title>
      <p>Logistics is an integral part of supply chains and directly influences expenses [6]. Optimized
logistics not only improves supply chain efficiency and enhances customer satisfaction but also
leads companies toward sustainability: better logistics means better resource allocation,
lower energy consumption, and less pollution [7]. Moreover, because they deteriorate, food
products differ from other types of material and therefore have specific transportation requirements
[8].</p>
      <p>In the case of Honkajoki Oy, logistics is of great significance since the processed material
is highly time-critical. Raw material degrades gradually; hence, it should arrive at the factory for
processing as soon as possible. Otherwise, the material is downgraded to lower categories
that require much more energy to process, or it is discarded due to the biochemical and chemical
deterioration of its contents, especially in the case of category three animal by-products [3]. In the
use case described in this paper, category three chicken by-products are transported from three
slaughterhouses to the Honkajoki processing facility by fleets of trucks. At any given time,
multiple filled containers may be waiting for transport. To this end, an optimization algorithm is designed to
guide operators in container selection at the slaughterhouse so as to maximize the quality and
the number of category three containers. This optimization algorithm uses a machine
learning model that predicts the container transit time from the slaughterhouses to the Honkajoki factory
yard instead of relying on average values.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Approach</title>
      <sec id="sec-3-1">
        <title>3.1. Data collection</title>
        <p>In the Honkajoki use case, data collection and management are arranged according to the system map
shown in Figure 1. Honkajoki has collected several years’ worth of logistics and processing data
and stored them on an Amazon Web Services (AWS) server (called Honkajoki Cloud). The historical logistics data used
in the scenario described in this paper are collected from the Honkajoki Electronic Logistic System
(HELOS) hosted within the Honkajoki Cloud. The logistics data include container and truck data and
timestamps of all logistics actions.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Data modeling</title>
        <p>The dataset used in this research holds information on containers, such as their raw material type,
weight, and the slaughterhouse where they were filled. It also contains logistics-related attributes, e.g., the
truck plate, the timestamp when a container finishes filling, and the timestamp when a truck reaches
the Honkajoki yard.</p>
        <p>Intuitively, the transit time of containers from slaughterhouses to the yard should depend on the
time trucks leave the slaughterhouses, which slaughterhouse they depart from, and the plate numbers
identifying the trucks. Container weights are not considered because containers are weighed only
after reaching the Honkajoki yard. Consequently, the corresponding features were selected in the
preprocessing stage, and the string attributes, i.e., slaughterhouse name and plate
number, were encoded as numerical values. Afterward, the timestamps were divided into four subparts, namely
week number, weekday number, hour, and minute. Figure 3 shows five random rows of the dataset
used for training transit time predictors.</p>
        <p>The time difference in seconds between a truck leaving the slaughterhouse and its arrival at the yard was
used as the label to train the models. A sample of the transit times of containers from slaughterhouse 1
(SH1) recorded in the historical dataset from 5/12/2022 to 25/12/2022 is plotted against their
departure timestamps in Figure 4.</p>
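        <p>The preprocessing described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the record field names, the sample plates and times, the use of ISO week numbers, weekday numbers running from 1 (Monday) to 7, and integer codes assigned in sorted order are all hypothetical choices.</p>
        <preformat>
```python
from datetime import datetime


def encode_labels(values):
    """Map each distinct string to an integer code, in sorted order."""
    return {v: i for i, v in enumerate(sorted(set(values)))}


def split_timestamp(ts):
    """Split a timestamp into (week number, weekday number, hour, minute)."""
    # Assumption: ISO week number and ISO weekday (Monday = 1 ... Sunday = 7).
    return ts.isocalendar()[1], ts.isoweekday(), ts.hour, ts.minute


def build_rows(records):
    """Turn raw records into (feature vector, transit time in seconds) pairs."""
    sh_codes = encode_labels(r["slaughterhouse"] for r in records)
    plate_codes = encode_labels(r["plate"] for r in records)
    rows = []
    for r in records:
        week, weekday, hour, minute = split_timestamp(r["departure"])
        features = [sh_codes[r["slaughterhouse"]], plate_codes[r["plate"]],
                    week, weekday, hour, minute]
        # Label: time between leaving the slaughterhouse and reaching the yard.
        label = (r["arrival"] - r["departure"]).total_seconds()
        rows.append((features, label))
    return rows


# Hypothetical records; names, plates, and times are invented for illustration.
records = [
    {"slaughterhouse": "SH1", "plate": "ABC-123",
     "departure": datetime(2022, 12, 5, 8, 30),
     "arrival": datetime(2022, 12, 5, 10, 15)},
    {"slaughterhouse": "SH2", "plate": "XYZ-789",
     "departure": datetime(2022, 12, 6, 14, 5),
     "arrival": datetime(2022, 12, 6, 15, 0)},
]
rows = build_rows(records)
```
        </preformat>
        <p>Each element of rows pairs six numerical features (slaughterhouse code, plate code, week number, weekday number, hour, minute) with a transit time in seconds, which is the shape of input a regression model expects.</p>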
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Prediction model development</title>
        <p>Predictive analytics uses historical data and statistics to analyze trends and forecast future
data. It relies on statistical and machine learning algorithms,
allowing organizations to act proactively based on the predicted data.
Consequently, predictive analytics has grown significantly, and multiple machine learning algorithms for
various prediction tasks, including time-series forecasting, have been developed to improve the
overall accuracy of the forecasts [9] [10].</p>
        <p>Models commonly used in time-series forecasting and predictive analysis were considered for
this use case: deep learning regression models, e.g., multi-layer perceptron (MLP) and long
short-term memory (LSTM) neural networks, and ensemble learning algorithms, such as the Random Forest
regressor. The models considered for testing are presented in Table 1.</p>
        <table-wrap>
          <label>Table 1</label>
          <caption>
            <p>Machine learning algorithms used for testing.</p>
          </caption>
          <table>
            <thead>
              <tr>
                <th>Machine learning technique</th>
                <th>Algorithm</th>
              </tr>
            </thead>
            <tbody>
              <tr>
                <td rowspan="7">Ensemble learning</td>
                <td>Random Forest regressor</td>
              </tr>
              <tr>
                <td>Decision Tree regressor</td>
              </tr>
              <tr>
                <td>Gradient Boosting regressor</td>
              </tr>
              <tr>
                <td>Extreme Gradient Boosting regressor</td>
              </tr>
              <tr>
                <td>Support Vector Machine regressor</td>
              </tr>
              <tr>
                <td>Extra Trees regressor</td>
              </tr>
              <tr>
                <td>Voting regressor</td>
              </tr>
              <tr>
                <td rowspan="2">Deep learning</td>
                <td>LSTM neural network regressor</td>
              </tr>
              <tr>
                <td>MLP neural network regressor</td>
              </tr>
            </tbody>
          </table>
        </table-wrap>
        <sec id="sec-3-3-3">
          <title>4. Preliminary results</title>
          <p>In this paper, ensemble learning models are created using existing algorithms from the scikit-learn
Python library [13], except the Extreme Gradient Boosting regressor, while deep learning models use
components from the TensorFlow library. For each algorithm, several models are created with
multiple parameter configurations, e.g., different numbers of layers and neurons in the case of neural
networks. The configurations yielding the best results based on Mean Absolute Error (MAE) are shown in
Table 2. Additionally, a linear regression model was used as the baseline for comparison.</p>
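          <p>For reference, the Mean Absolute Error is the mean of the absolute differences between observed and predicted values. The sketch below computes it in plain Python and also shows how a voting regressor can combine two models by averaging their predictions, which is the default behaviour of scikit-learn's VotingRegressor when no weights are given. Whether the study used uniform weights is not stated, so that is an assumption here, and all numbers are invented for illustration.</p>
          <preformat>
```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def voting_average(*predictions):
    """Average the predictions of several models, sample by sample."""
    # Assumption: uniform weights, i.e. a plain arithmetic mean per sample.
    return [sum(values) / len(values) for values in zip(*predictions)]


# Hypothetical transit times in seconds, invented for illustration.
observed = [6300.0, 3300.0, 5400.0]
model_a = [6000.0, 3600.0, 5400.0]  # predictions from one hypothetical model
model_b = [6600.0, 3400.0, 5000.0]  # predictions from another
combined = voting_average(model_a, model_b)

mae_a = mean_absolute_error(observed, model_a)
mae_b = mean_absolute_error(observed, model_b)
mae_combined = mean_absolute_error(observed, combined)
```
          </preformat>
          <p>In this toy example the averaged predictions have a lower MAE than either model alone because the two models err in opposite directions on some samples. Averaging does not guarantee an improvement in general.</p>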
          <p>The logistics data are randomly split into training and test sets with a ratio of 4:1, and all the models
are trained and tested on the same training and test sets. The performance results of all the models are shown in
Table 2.</p>
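          <p>A random 4:1 split of this kind can be sketched as follows. The function name and the fixed seed are hypothetical; the seed is added only so that every model sees the same partition. With scikit-learn, the train_test_split function with test_size=0.2 offers comparable behaviour.</p>
          <preformat>
```python
import random


def train_test_split_4_to_1(rows, seed=0):
    """Shuffle rows reproducibly and split them 80/20 into train and test."""
    shuffled = list(rows)                  # copy so the input stays untouched
    random.Random(seed).shuffle(shuffled)  # seeded shuffle for repeatability
    cut = (len(shuffled) * 4) // 5         # four fifths go to training
    return shuffled[:cut], shuffled[cut:]


rows = list(range(10))  # stand-ins for (features, label) pairs
train, test = train_test_split_4_to_1(rows, seed=42)
```
          </preformat>
          <p>Calling the function again with the same seed returns the same partition, so different models can be compared on identical training and test sets.</p>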
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>5. Conclusion</title>
      <p>Processing animal by-products is a time-critical operation that requires minimizing wasted time.
The process itself can be well planned. However, the transportation of the material from the
slaughterhouses to the processing facility may generate delays and unplanned changes. Thus,
predicting such disruptions in logistics operations improves the overall outcome of by-product
processing. As presented in this paper, AI models trained on historical data can provide the needed
prediction. As observed in this research, the voting regressor combining the Decision Tree and Extra
Trees regressors provides the lowest error with better stability. Future work
may include testing other prediction algorithms with different approaches to
feature engineering. In line with the requirements of the original use-case scenario, the results
of the optimization algorithm using data produced by the prediction models will also be presented in
the future.</p>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgements</title>
      <p>This research has received funding from the European Union’s Horizon Europe research and
innovation programme under grant agreement No. 101070076 for the
research project CLARUS, titled Optimizing Production and Logistic Resources in the
Time-Critical Bio Production Industries in Europe.</p>
    </sec>
    <sec id="sec-6">
      <title>Declaration on Generative AI</title>
      <p>The author(s) have not employed any Generative AI tools.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <given-names>B. O.</given-names>
            <surname>Alao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. B.</given-names>
            <surname>Falowo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. Y.</given-names>
            <surname>Chulayo</surname>
          </string-name>
          and
          <string-name>
            <given-names>V.</given-names>
            <surname>Muchenje</surname>
          </string-name>
          , “
          <article-title>The Potential of Animal By-Products in Food Systems: Production, Prospects and Challenges</article-title>
          ,”
          <source>Sustainability</source>
          , vol.
          <volume>9</volume>
          , no.
          <issue>6</issue>
          , p.
          <fpage>1089</fpage>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref1a">
        <mixed-citation>
          “Regulation (EC) No 1069/2009 of the European Parliament and of the Council of 21 October 2009 laying down health rules as regards animal by-products and derived products not intended for human consumption,” 14 November
          <year>2009</year>
          . [Online]. Available: https://eur-lex.europa.eu/eli/reg/2009/1069/oj#. [Accessed 4 January 2024].
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          “Commission Regulation (EU) No 142/2011 of 25 February 2011 implementing Regulation (EC) No 1069/2009 and implementing Council Directive 97/78/EC,” 26 February
          <year>2011</year>
          . [Online]. Available: https://eur-lex.europa.eu/eli/reg/2011/142/oj. [Accessed 4 January 2024].
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <surname>EFPRA</surname>
          </string-name>
          ,
          <article-title>"Rendered products,"</article-title>
          [Online]. Available: https://efpra.eu/rendered-products/. [Accessed 20 January 2024].
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          “CLARUS project,” [Online]. Available: https://clarus-project.eu/. [Accessed 15 January 2024].
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <given-names>A.</given-names>
            <surname>Fredriksson</surname>
          </string-name>
          and
          <string-name>
            <given-names>K.</given-names>
            <surname>Liljestrand</surname>
          </string-name>
          ,
          <article-title>"Capturing food logistics: a literature review and research agenda,"</article-title>
          <source>International Journal of Logistics Research and Applications</source>
          , vol.
          <volume>18</volume>
          , no.
          <issue>1</issue>
          , pp.
          <fpage>16</fpage>
          -
          <lpage>34</lpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <given-names>M.</given-names>
            <surname>Christopher</surname>
          </string-name>
          ,
          <article-title>Logistics and supply chain management</article-title>
          ,
          <source>Pearson UK</source>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <given-names>R.</given-names>
            <surname>van Hoek</surname>
          </string-name>
          ,
          <article-title>"Postponement and the reconfiguration challenge for food supply chains,"</article-title>
          <source>Supply Chain Management: An International Journal</source>
          , vol.
          <volume>4</volume>
          , no.
          <issue>1</issue>
          , pp.
          <fpage>18</fpage>
          -
          <lpage>34</lpage>
          ,
          <year>1999</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <given-names>J. G.</given-names>
            <surname>De Gooijer</surname>
          </string-name>
          and
          <string-name>
            <given-names>R. J.</given-names>
            <surname>Hyndman</surname>
          </string-name>
          ,
          <article-title>"25 years of time series forecasting,"</article-title>
          <source>International Journal of Forecasting</source>
          , vol.
          <volume>22</volume>
          , no.
          <issue>3</issue>
          , pp.
          <fpage>443</fpage>
          -
          <lpage>473</lpage>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <given-names>V.</given-names>
            <surname>Kumar</surname>
          </string-name>
          and
          <string-name>
            <given-names>M. L.</given-names>
            <surname>Garg</surname>
          </string-name>
          , “
          <article-title>Predictive Analytics: A Review of Trends and Techniques</article-title>
          ,”
          <source>International Journal of Computer Applications</source>
          , vol.
          <volume>182</volume>
          , pp.
          <fpage>31</fpage>
          -
          <lpage>37</lpage>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
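          <string-name>
            <given-names>M.</given-names>
            <surname>Elahi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. O.</given-names>
            <surname>Afolaranmi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. L.</given-names>
            <surname>Martinez Lastra</surname>
          </string-name>
          and
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Perez Garcia</surname>
          </string-name>
          , “
          <article-title>A comprehensive literature review of the applications of AI techniques</article-title>
          ,”
          <source>Discover Artificial Intelligence</source>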
          , vol.
          <volume>3</volume>
          ,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <given-names>R. A.</given-names>
            <surname>Davis</surname>
          </string-name>
          and
          <string-name>
            <given-names>M. S.</given-names>
            <surname>Nielsen</surname>
          </string-name>
          ,
          <article-title>"Modeling of time series using random forests: theoretical developments,"</article-title>
          <source>Electronic Journal of Statistics</source>
          , vol.
          <volume>14</volume>
          , pp.
          <fpage>3644</fpage>
          -
          <lpage>3671</lpage>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <string-name>
            <given-names>F.</given-names>
            <surname>Pedregosa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Varoquaux</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gramfort</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Michel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Thirion</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Grisel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Blondel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Prettenhofer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Weiss</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Dubourg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Vanderplas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Passos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Cournapeau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Brucher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Perrot</surname>
          </string-name>
          and
          <string-name>
            <given-names>E.</given-names>
            <surname>Duchesnay</surname>
          </string-name>
          ,
          <article-title>"Scikit-learn: Machine Learning in Python,"</article-title>
          <source>Journal of Machine Learning Research</source>
          , vol.
          <volume>12</volume>
          , pp.
          <fpage>2825</fpage>
          -
          <lpage>2830</lpage>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>