<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>November</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>The Machine Learning Model Development Lifecycle for Prediction of Electrical Energy Market Volumes</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Anatoliy Doroshenko</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Dmitry Zhora</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Oleksii Zhyrenkov</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          ,
          <addr-line>Peremohy Ave. 37, Kyiv, 03056</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Institute of Software Systems of the National Academy of Sciences of Ukraine</institution>
          ,
          <addr-line>Glushkov Ave. 40, build. 5, Kyiv, 03187</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2024</year>
      </pub-date>
      <volume>2</volume>
      <fpage>0</fpage>
      <lpage>21</lpage>
      <abstract>
        <p>The development of a production-ready software solution that employs artificial intelligence is a complex incremental process that requires competence in multiple areas such as the business domain, programming, statistics, machine learning, containers, networking and deployment. This is a challenge for commercial companies, as specialists with diverse qualifications and skills are required. This article highlights the current state of electrical energy markets in Ukraine and provides a comparative analysis of regression algorithms used for market volume forecasting. The development process is demonstrated from a technical perspective: the dataset is analyzed and augmented with additional information, the optimal set of input parameters is determined, the best machine learning model is trained and serialized to a file, the Docker image is built with a software layer that preloads the serialized model, and the Docker container is deployed to a Kubernetes cluster for real-time access via a REST API.</p>
      </abstract>
      <kwd-group>
        <kwd>Machine learning</kwd>
        <kwd>regression algorithms</kwd>
        <kwd>model serialization</kwd>
        <kwd>docker container</kwd>
        <kwd>Kubernetes</kwd>
        <kwd>MLOps</kwd>
        <kwd>BentoML</kwd>
        <kwd>Yatai</kwd>
        <kwd>inference platform</kwd>
        <kwd>model deployment</kwd>
        <kwd>electrical energy markets</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-2">
      <title>1. Introduction</title>
      <p>
        The use of machine learning techniques [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] has become a de facto standard for modern systems that
need to provide a forecast, classify data records or implement an associative search. In the long
term, this approach is expected to provide significant economic benefits. One of the popular and
established machine learning libraries is scikit-learn [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. The regression algorithms available in this
library are used in the current research to build a forecasting model for electrical energy markets.
      </p>
      <p>The increasing complexity of energy markets requires advanced forecasting models to
predict future trends accurately. These models are crucial for decision making, trading and
planning in energy systems. However, the development of a production-ready forecasting solution
involves multiple stages, each requiring expertise in programming, mathematics, statistics,
machine learning, containers, deployment systems and potentially cloud technologies.</p>
      <p>In this article we showcase an end-to-end machine learning workflow for forecasting the trade
volume of electrical energy markets using popular regression algorithms and modern
deployment infrastructure. The workflow employs the BentoML orchestration platform for model
packaging and API hosting, Docker for containerization and Kubernetes for autoscaling. The aim is
to provide a practical guide that can be applied in a real-world environment, emphasizing ease of
use, modularity, scalability, and interoperability.</p>
    </sec>
    <sec id="sec-3">
      <title>2. Electrical energy markets dataset</title>
      <p>
        Historically, Ukraine supported only one market for electrical energy: a market of
long-term suppliers of electricity. On July 1st, 2019, Ukraine adopted the European model [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] that assumes the
following four markets: bilateral, day-ahead, intraday, and balancing. Although the electricity market
models in Europe differ in some respects [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], this was considerable progress in facilitating
electricity trading between countries.
      </p>
      <p>The bilateral market can also be referred to as a futures or forward market. In Ukraine, as shown
in Figure 1, the total amount of deals is recorded every hour. The markets are organized in a way
that provides integrated access for all market participants and balances energy price and volatility.
For example, the bilateral market has the lowest electricity price, high volume and low volatility.
Conversely, the balancing market has the highest average price, low volume and high volatility.</p>
      <p>
        The large and complex electrical grids that belong to private or state enterprises still obey the
laws of physics. The amount of produced electrical energy is equal to the amount of consumed energy,
and this amount is exactly represented by the corresponding electricity market volume [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
Also, if the amount of electrical energy traded and transmitted is measured at power substations,
then minor losses associated with electrical resistance can be disregarded. In summary, for the
current application domain the following terms are equivalent: energy production, energy
consumption and market volume.
      </p>
      <p>The dataset used for this research covers the time range from July 1st, 2020, to December 31st,
2021. The corresponding market volume dynamics is shown in Figure 2. The volume of each
of the four markets mentioned at the beginning of the section is recorded hourly in
megawatt-hours. For comparison, some European markets record the trading data every 15 minutes.</p>
    </sec>
    <sec id="sec-4">
      <title>3. Usage of additional parameters</title>
      <p>
        It is common for real-world processes that the dynamics of monitored parameters is affected by
other factors that are not available in the original dataset. In particular, electricity production is
influenced by the outside temperature [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. It was decided to add two temperature columns with hourly
data representing the center of Ukraine and Kyiv; the corresponding chart is shown in Figure 3.
      </p>
      <p>It is natural to assume periodic patterns in the consumption of electrical energy. Ultimately,
they reflect the activity of final consumers. In particular, the following cycle types are possible:
daily, weekly, monthly, and yearly. The challenge is to represent time in such a way
that moments close in time are interpreted as close by the machine learning algorithm.</p>
      <p>
        A solution that is convenient from the computational perspective [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] is to calculate sine and
cosine functions whose argument represents the phase of the cycle. With this encoding,
close values on the timescale are represented by close values of these periodic functions,
including moments on either side of a cycle boundary such as 23:00 and 00:00.
      </p>
      <p>The augmented spreadsheet is shown in Figure 4. Besides the original four columns with market
volume data, ten other columns were added. The periodic data series were calculated with the
help of an algorithm written in Python.</p>
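      <p>The sine and cosine encoding described above can be sketched as follows. This is a minimal illustration and not the authors' algorithm: only the column name SinDay appears in the article, so the other column names, the helper functions and the exact phase definitions for the weekly, monthly and yearly cycles are assumptions.</p>
      <preformat>
```python
import calendar
import math
from datetime import datetime


def cyclic_features(moment: datetime) -> dict:
    """Encode a timestamp as sine/cosine pairs for four cycle types."""
    hour = moment.hour + moment.minute / 60.0
    days_in_month = calendar.monthrange(moment.year, moment.month)[1]
    days_in_year = 366 if calendar.isleap(moment.year) else 365
    day_of_year = moment.timetuple().tm_yday

    # Phase of each cycle as a fraction in [0, 1).
    phases = {
        "Day": hour / 24.0,
        "Week": (moment.weekday() * 24.0 + hour) / 168.0,
        "Month": ((moment.day - 1) * 24.0 + hour) / (days_in_month * 24.0),
        "Year": ((day_of_year - 1) * 24.0 + hour) / (days_in_year * 24.0),
    }
    features = {}
    for name, phase in phases.items():
        features["Sin" + name] = math.sin(2.0 * math.pi * phase)
        features["Cos" + name] = math.cos(2.0 * math.pi * phase)
    return features


def cyclic_distance(a: dict, b: dict, name: str) -> float:
    """Euclidean distance between two moments on the unit circle of one cycle."""
    return math.hypot(a["Sin" + name] - b["Sin" + name],
                      a["Cos" + name] - b["Cos" + name])
```
      </preformat>
      <p>With this representation, 23:00 and 00:00 of the following day are close on the daily circle, whereas a plain hour number would place them 23 units apart.</p>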
    </sec>
    <sec id="sec-5">
      <title>4. Resampling of original data</title>
      <p>As each original data record contains the figures for the current hour, it makes sense to add
two types of columns to the dataset: parameters that represent the history and parameters that
represent the future to be forecasted. It was a heuristic decision to consider up to 24 hours in both
directions. A special naming convention was applied to the new parameters. For example, the name
BilateralM1 designates the bilateral market volume one hour before the current
record under consideration. Similarly, the name BilateralP1 indicates the bilateral market volume
one hour ahead. This additional information is expected to improve the forecast accuracy.</p>
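      <p>A possible way to build such history and future columns is sketched below in plain Python. The function name and the list-of-dictionaries record layout are hypothetical; only the M/P naming convention and the 24-hour horizon come from the text.</p>
      <preformat>
```python
def add_shifted_columns(records: list, column: str, horizon: int = 24) -> list:
    """Add history (M1..Mh) and future (P1..Ph) values of one column.

    Records near either end of the series lack some neighbours, so only
    records with a full window in both directions are returned.
    """
    values = [record[column] for record in records]
    resampled = []
    for i in range(horizon, len(records) - horizon):
        row = dict(records[i])
        for k in range(1, horizon + 1):
            row[f"{column}M{k}"] = values[i - k]  # value k hours ago
            row[f"{column}P{k}"] = values[i + k]  # value k hours ahead
        resampled.append(row)
    return resampled
```
      </preformat>
      <p>Applied to a series of hourly records, the function returns the input length minus twice the horizon, because incomplete rows at both ends are dropped.</p>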
      <p>The resulting dataset had 13,129 records. The first 24 data records and the last 24
records were deleted because after resampling they did not contain all necessary parameters. The dataset
was split into training and testing parts in the proportion of 80% to 20%. The
random split functionality is provided by the scikit-learn library. The datasets were saved to files, so
that the different machine learning algorithms considered later are evaluated under equal conditions.</p>
    </sec>
    <sec id="sec-6">
      <title>5. Model comparison metrics</title>
      <p>Three metrics were used in this work to compare input parameter sets and different regression
algorithms: the R<sup>2</sup> score (coefficient of determination), MAPE (mean absolute percentage error) and MAE
(mean absolute error). From the computational perspective, each metric measures the discrepancy
between the test set and the forecasted data for a selected output column representing one of the electrical
energy trading volumes. The R<sup>2</sup> score was used to make decisions, although the three metrics were
largely correlated. The nearest neighbors regression algorithm was used to check the performance
of input parameters, as it provides quite competitive results and has a limited number of
hyperparameters to tune. Other algorithms available in the scikit-learn library were evaluated as a next step.</p>
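      <p>For reference, the three metrics can be written out in a few lines of plain Python. This is an illustrative sketch of the standard definitions; in practice the equivalent functions r2_score, mean_absolute_percentage_error and mean_absolute_error from sklearn.metrics would normally be used.</p>
      <preformat>
```python
def r2_score(actual: list, predicted: list) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def mape(actual: list, predicted: list) -> float:
    """Mean absolute percentage error, returned as a fraction (0.01 = 1%)."""
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def mae(actual: list, predicted: list) -> float:
    """Mean absolute error, in the units of the target column."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)
```
      </preformat>
      <p>A higher R<sup>2</sup> score is better, with 1 as the maximum, while lower MAPE and MAE values are better.</p>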
      <p>The complexity of the machine learning algorithms within this library is hidden behind the fit and
predict methods, which have the same signature across many regression and classification algorithms.
So, it is relatively simple to reuse these methods and to substitute one algorithm for another.</p>
    </sec>
    <sec id="sec-7">
      <title>6. Feature selection approaches</title>
      <p>In general, it is possible to select the set of input parameters manually. This workflow involves
adding or removing one parameter at a time and evaluating the performance of the resulting model.
This process is time consuming, as there are 2<sup>n</sup> possible combinations of parameters, where n is the total
number of possible inputs. The alternative is to use the automation facilities provided by the machine
learning library, in this case scikit-learn. The following classes are worth mentioning:
GridSearchCV, LassoCV and SelectFromModel. The latter option was used in the current research.</p>
      <p>
        Periodic parameters like SinDay were not added to the history, as such parameters precisely
indicate the moment in time, so their history would provide only redundant information. For
every hour we have 4 parameters representing the electrical energy market volumes and 2
parameters representing the temperature. Thus, overall we have 6 * 24 = 144 input parameters to
select from. The final, locally optimal set of input features obtained with the SelectFromModel class
contained 60 entries out of 144 [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. The R<sup>2</sup> score improved slightly and remained around 96%
for the nearest neighbors algorithm. The positive outcome in this case is that the model was somewhat
simplified. The high dimensionality of the input space is typically considered a problem. Thus, the
removal of noisy parameters should be a positive step in the majority of cases.
      </p>
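      <p>A minimal sketch of automated selection with SelectFromModel is given below. The article does not state which base estimator or threshold was used, so the random forest, the "mean" threshold and the synthetic data are assumptions made only for illustration.</p>
      <preformat>
```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import SelectFromModel

# Synthetic data: 12 candidate inputs, only the first three carry signal.
rng = np.random.default_rng(1)
X = rng.normal(size=(400, 12))
y = 2.0 * X[:, 0] - 2.0 * X[:, 1] + 2.0 * X[:, 2] + 0.1 * rng.normal(size=400)

# The base estimator is an assumption; any model exposing
# feature_importances_ or coef_ can be used with SelectFromModel.
selector = SelectFromModel(
    RandomForestRegressor(n_estimators=50, random_state=1),
    threshold="mean",  # keep features whose importance exceeds the mean
)
selector.fit(X, y)

selected = np.flatnonzero(selector.get_support())  # indices of kept inputs
X_reduced = selector.transform(X)                  # dataset with kept inputs only
```
      </preformat>
      <p>The reduced matrix can then be passed to any regressor in place of the full input set.</p>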
    </sec>
    <sec id="sec-8">
      <title>7. Prediction error distribution</title>
      <p>The bilateral market was selected as an example to demonstrate the error distribution. Its
market volume prediction error on the test set is shown in Figures 5 and 6. For convenience of
representation, the test set was sorted by original market volume. The predicted values are shown
on the first chart with dots. The histogram makes it possible to estimate the probability density of the error
distribution. It is unusual that the obtained prediction error is not quite Gaussian. In particular, this is
the case for the bilateral and intraday market volumes.</p>
    </sec>
    <sec id="sec-9">
      <title>8. Selection of regression algorithm</title>
      <p>Up to this point, only one algorithm was considered: the nearest neighbors regressor. Clearly,
it makes sense to evaluate the performance of other algorithms on the same set of input
parameters. The output parameters were selected for one-day-ahead forecasting: BilateralP24,
DayAheadP24, IntradayP24 and BalancingP24. The prediction accuracy results are provided in
Tables 1-3 below. In particular, a comparison with the following established forecasting
instruments is available: multi-layer perceptron, support vector machine, and linear regression.
It is worth noting that some algorithms do not natively support the multi-output
configuration, so it was necessary to use the MultiOutputRegressor class to overcome this problem
and cover four output parameters with one machine learning model.</p>
      <p>For all three metrics considered in this work, the winning algorithm is the HistogramGradientBoosting
regressor. It is one of the fastest methods, as it quantizes the input feature values into a small number of
histogram bins, which reduces the amount of work needed to find tree splits. Another benefit is that it can
natively process datasets with missing values. The training phase for this algorithm and the current dataset
takes about 20 seconds, while the inference or prediction phase takes less than a second. In general, the
ensemble algorithms perform much better for this specific forecasting task.</p>
      <p>Two types of multi-layer perceptron were tried on the dataset. Here QNO stands for
quasi-Newton optimizer and SGD stands for stochastic gradient descent. In the first case the synaptic
weights of the neural network are determined by solving the optimization task with a
second-order approximation of the error function. In the second case the minimum
(local or global) is found with an iterative descent process. It appears that for this task the
4-layer architecture performs better than the 3-layer or 5-layer ones.</p>
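      <table-wrap>
        <label>Table 2</label>
        <caption>
          <p>Mean absolute percentage errors for selected regression algorithms</p>
        </caption>
        <table>
          <thead>
            <tr><th>Regression Algorithm</th><th>Bilateral</th><th>DayAhead</th></tr>
          </thead>
          <tbody>
            <tr><td>Histogram Gradient Boosting</td><td>0.009708</td><td>0.035550</td></tr>
            <tr><td>Ada Boost Regressor</td><td>0.010436</td><td>0.039889</td></tr>
            <tr><td>Gradient Boosting Regressor</td><td>0.011671</td><td>0.041963</td></tr>
            <tr><td>Extra Trees Regressor</td><td>0.013403</td><td>0.044706</td></tr>
            <tr><td>Nearest Neighbors Regressor</td><td>0.014842</td><td>0.047414</td></tr>
            <tr><td>Random Forest Regressor</td><td>0.015383</td><td>0.050163</td></tr>
            <tr><td>Support Vector Machine</td><td>0.020497</td><td>0.065063</td></tr>
            <tr><td>Multi-Layer Perceptron (QNO)</td><td>0.022011</td><td>0.068955</td></tr>
            <tr><td>Multi-Layer Perceptron (SGD)</td><td>0.023281</td><td>0.067661</td></tr>
            <tr><td>Elastic Net Regressor</td><td>0.021644</td><td>0.067856</td></tr>
            <tr><td>Linear Regression</td><td>0.021679</td><td>0.067995</td></tr>
            <tr><td>Bayes Ridge Regressor</td><td>0.022225</td><td>0.069814</td></tr>
          </tbody>
        </table>
      </table-wrap>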
    </sec>
    <sec id="sec-10">
      <title>9. Machine learning operations</title>
      <p>The modern landscape of Machine Learning Operations (MLOps) emphasizes the integration of
machine learning models into production environments, ensuring that they deliver consistent and
reliable results. MLOps encompasses a set of practices that aim to automate and improve the
deployment, monitoring, and management of ML models. The key principles include collaboration
between data scientists and operations teams, continuous integration and deployment (CI/CD), and
the use of standardized tools for model serving and monitoring.</p>
      <p>
        Various ML inference tools have emerged to facilitate these processes, including TensorFlow
Serving, MLflow, and BentoML [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. Each tool offers unique features tailored to different aspects of
the ML lifecycle. Table 4 highlights the core features and helps to understand the use cases from
the architecture perspective.
      </p>
      <p>
        Among these tools, BentoML stands out for its robust architecture designed specifically for
model packaging and deployment. The logic behind BentoML revolves around creating a "Bento"
      </p>
      <list list-type="order">
        <list-item>
          <p>
            Deployment Approach: BentoML uses containerization to simplify deployment, while other
platforms like Kubeflow and TFX use Kubernetes [
            <xref ref-type="bibr" rid="ref12">12</xref>
            ] for orchestration purposes. SageMaker
offers a managed service approach in which multiple user-friendly options are available.
          </p>
        </list-item>
        <list-item>
          <p>
            Supported Libraries: BentoML supports a wide range of libraries including TensorFlow, PyTorch, and
scikit-learn, making it versatile for different types of ML projects. In contrast, cloud tools like
SageMaker support various frameworks, but do not explicitly mention them. The TFX platform is
designed specifically for TensorFlow [
            <xref ref-type="bibr" rid="ref17">17</xref>
            ], and this may limit its applicability in projects that use
other ML libraries.
          </p>
        </list-item>
        <list-item>
          <p>
            Scalability: Both the Kubeflow and TFX platforms provide exceptional scalability options due to
their Kubernetes-based architecture, making them suitable for large-scale ML operations [
            <xref ref-type="bibr" rid="ref15">15</xref>
            ]. It is worth noting that scalability comes with increased complexity in setup and maintenance.
          </p>
        </list-item>
        <list-item>
          <p>Ease of Use: BentoML is noted for its user-friendly API, which is beneficial for developers
looking for simplicity. In comparison, the Kubeflow platform has a steeper learning curve due to its
comprehensive features. This trade-off between ease of use and feature richness is a crucial
consideration for teams choosing an MLOps tool.</p>
        </list-item>
        <list-item>
          <p>
            Focus Areas: Each tool has its own focus. BentoML is primarily aimed at model
serving and deployment, while MLflow emphasizes tracking and registry capabilities [
            <xref ref-type="bibr" rid="ref16">16</xref>
            ].
Kubeflow covers the entire MLOps lifecycle, making it a more holistic solution [
            <xref ref-type="bibr" rid="ref15">15</xref>
            ]. The choice
of tool often depends on the specific needs of the project and the existing infrastructure.
          </p>
        </list-item>
      </list>
      <p>The core architecture consists of several components: the Model Registry for managing model
versions, the API Server for serving predictions, and the Docker module for containerization support.
This modular design enables seamless scaling and management of machine learning models in
production. Additionally, BentoML's architecture includes features for model versioning, allowing
easy rollback and A/B testing of different model versions.</p>
      <p>[Table 4, fragment recovered from extraction. Supported Libraries: TensorFlow, PyTorch, scikit-learn; TensorFlow, PyTorch; TensorFlow, PyTorch, MXNet; Various; TensorFlow.]</p>
    </sec>
    <sec>
      <title>10. BentoML architecture overview</title>
      <p>The logic behind the BentoML architecture is centered on simplifying the deployment process while
maintaining flexibility. By packaging models into a single service unit, BentoML
reduces the complexity associated with the deployment of machine learning models. The service can
be defined using simple Python decorators, allowing data scientists to focus on model
development rather than deployment intricacies.</p>
      <p>When users train a model using popular ML libraries such as TensorFlow or PyTorch, they can
create a Bento Service by defining an inference function and specifying input/output types. This
service can then be serialized and stored in the Model Registry for future use. The Model Registry
not only stores the model but also maintains metadata about the model's performance, training
data, and hyperparameters, facilitating reproducibility and traceability.</p>
      <p>
        The BentoML architecture consists of several key components [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]:
      </p>
      <list list-type="order">
        <list-item>
          <p>Bento Service: The fundamental unit in BentoML is the Bento Service, which encapsulates a trained
ML model along with its inference logic and dependencies. Such a service can be easily deployed
as a REST API or a gRPC endpoint. The Bento Service also includes pre-processing and
post-processing logic, ensuring that data transforms are consistent between training and inference.</p>
        </list-item>
        <list-item>
          <p>Model Registry: BentoML includes a Model Registry that manages different versions of models.
This feature allows teams to track model lineage and facilitates rollback to previous versions if
necessary. The registry also supports tagging and metadata management, thus simplifying the
identification of models for specific use cases or experiments.</p>
        </list-item>
        <list-item>
          <p>API Server: The server exposes the model's inference capabilities through standardized HTTP
endpoints, making it accessible to client applications. It handles request parsing, input
validation and error handling, providing a robust interface for model serving.</p>
        </list-item>
        <list-item>
          <p>Containerization: BentoML supports containerization through Docker, enabling users to create
portable images that encapsulate the entire environment required to run the model. This
includes not just the model itself, but also all dependencies, ensuring consistency and reliability
across different deployment environments.</p>
        </list-item>
        <list-item>
          <p>Deployment Options: Users can deploy their Bento Services on various platforms, including
cloud services like AWS, Azure or Google Cloud. In addition, this can be done on Kubernetes
clusters using orchestration tools like Yatai. As an expandable and versatile tool, BentoML
supports edge deployment for IoT devices and mobile applications.</p>
        </list-item>
        <list-item>
          <p>Monitoring and Logging: BentoML integrates with popular monitoring instruments to provide
insights into model performance, HTTP request latency and resource utilization. This
component is crucial for maintaining model health and detecting drift in the production environment.</p>
        </list-item>
        <list-item>
          <p>Adaptive Batching: To optimize performance, BentoML implements adaptive micro-batching
that dynamically adjusts batch sizes depending on incoming request patterns and available
computing resources. This feature significantly improves throughput for high-volume services.</p>
        </list-item>
      </list>
      <p>The following entity diagram illustrates the architecture described above:</p>
      <p>
        This architecture provides a comprehensive solution for model deployment, addressing the key
challenges in MLOps such as versioning, scalability, and integration with existing infrastructure.
By abstracting away many of the complexities of deployment, BentoML allows data scientists and
engineers to focus on model development and improvement, ultimately accelerating the lifecycle.
      </p>
    </sec>
    <sec>
      <title>11. Deployment at scale with Yatai</title>
      <p>
        Yatai, an advanced platform developed by BentoML, is designed for seamless model deployment on
Kubernetes. Its architecture integrates the basic principles of scalability, modularity and flexibility,
making it highly efficient for managing complex machine learning workloads in production [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ].
Yatai leverages Kubernetes autoscaling capabilities to dynamically allocate resources, ensuring that
models run efficiently in diverse computational environments, for example when GPUs are
used for inference and CPUs only for preprocessing.
      </p>
      <p>
        One of the fundamental architectural principles of Yatai is its support for microservice-based
deployments [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. Each model component is containerized, thus enabling modular deployment and
maintenance. This microservice architecture allows for independent scaling of different parts of the
system, optimizing the resource usage depending on workload intensity. For instance, inference
workloads may scale on demand with GPU-based microservices, while preprocessing tasks can still
rely on CPU-based services.
      </p>
      <p>
        Another significant principle is its support for version control and model lifecycle management.
Yatai stores different versions of machine learning models, enabling seamless rollbacks and updates
and making it easier to manage production environments. These versioning capabilities are tightly
integrated with the BentoML framework for model hosting, which streamlines the
continuous integration and continuous deployment (CI/CD) workflows. Yatai also incorporates many
observability features, offering detailed logging, monitoring and tracing capabilities to ensure that
models perform as expected in production [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. This includes:
      </p>
      <p>In addition, the Yatai architecture supports advanced techniques like adaptive micro-batching.
This approach improves throughput by batching requests dynamically depending on the system
load. This feature is particularly useful for models serving real-time predictions, as it balances
response time with computational efficiency. Yatai provides built-in tools for rolling deployments,
so that updates can be pushed without downtime. It also provides integration with cloud
platforms, offering flexible infrastructure support. Here are some additional features covered:</p>
      <list list-type="bullet">
        <list-item><p>Canary deployments for gradual rollouts and zero-downtime updates</p></list-item>
        <list-item><p>Blue-green deployments for the capability to revert the version</p></list-item>
        <list-item><p>Traffic splitting for controlled experiments</p></list-item>
      </list>
      <p>By leveraging Kubernetes orchestration and BentoML model serving capabilities, Yatai makes
the deployment of machine learning models simple, scalable and more efficient. Its architecture is
purpose-built to handle the complexity of machine learning workflows while maintaining high
operational efficiency, especially in production environments.</p>
    </sec>
    <sec>
      <title>12. Prediction model deployment</title>
      <p>As a first step, we need to load the trained model serialized in the .onnx format into the local model
registry. In order to simplify the parameter configuration, the Config class has been introduced.
This class can store various settings such as the model path, API endpoints and environment variables.</p>
      <preformat>import bentoml
import onnx

onnx_model = onnx.load("models/split-histo.onnx")
bentoml.onnx.save_model(CONFIG.MODEL, onnx_model)  # register the model first
bento_model = bentoml.models.get(CONFIG.MODEL)     # then look it up by name</preformat>
      <p>It is important to note that the choice of the ONNX (Open Neural Network Exchange) format provides
interoperability between different machine learning frameworks, enhancing model portability.
As a next step in the BentoML architecture, we need to define a service class and deploy it to runners.
This approach encapsulates the model and its inference logic into a deployable unit.</p>
      <preformat>import bentoml
from bentoml.io import JSON, NumpyNdarray

# Inside the constructor of the service class (Predictor):
self._runner = bentoml.onnx.get(model).to_runner()
self._service = bentoml.Service(service, runners=[self._runner])
self._service.api(input=NumpyNdarray(), output=JSON())(self.predict)</preformat>
      <p>The use of runners allows for efficient resource allocation and parallel processing of requests,
which is crucial for handling high-volume prediction tasks. In order to handle the forecasting
requests in real time, a RESTful API service can be developed using the Python-based
FastAPI framework. This framework is chosen for its simplicity and speed, making it well suited for serving machine
learning models in production. FastAPI offers remarkable performance due to its asynchronous
features and effective request management. The documentation is generated automatically, which
reduces the development time and increases API comprehension. The API is expected to receive
POST requests with relevant market data from the client application; it interacts with the preloaded
forecasting model and returns the predictions in real time.</p>
      <preformat>from fastapi import FastAPI

app = FastAPI()
app.include_router(dummy_router)
predictor = Predictor(CONFIG.SERVICE, CONFIG.MODEL)
predictor._service.mount_asgi_app(app)
svc = predictor._service  # Entry point for the bentofile</preformat>
      <p>The service is configured to handle multiple concurrent requests, providing energy traders and
market analysts with near-instantaneous predictions. The asynchronous nature of the FastAPI request
handling loop ensures that the service can handle the high demands of live market environments. Also,
the API can be extended to include:</p>
      <list list-type="bullet">
        <list-item><p>Input validation to ensure data quality</p></list-item>
        <list-item><p>Rate limiting to prevent service abuse</p></list-item>
        <list-item><p>Authentication and authorization for secure access</p></list-item>
        <list-item><p>Caching mechanisms for frequently requested predictions</p></list-item>
      </list>
      <p>The key element within the BentoML development lifecycle is bentofile.yaml, a configuration file
defining the packaging methods and the input service. It also allows referencing the aforementioned svc
object and the requirements.txt file with all libraries required by the running process.</p>
      <preformat>service: "main:svc"
[...]</preformat>
      <p>The bentofile.yaml can be customized further to include:</p>
      <list list-type="bullet">
        <list-item><p>Environment variables for different deployment stages</p></list-item>
        <list-item><p>Resource requirements (CPU, memory, GPU)</p></list-item>
        <list-item><p>Health check endpoints</p></list-item>
        <list-item><p>Logging and monitoring configurations</p></list-item>
      </list>
      <p>A uniform service deployment procedure for different environments can be easily achieved
when the Bento service is encapsulated in a Docker container. So, the next stage in the deployment
pipeline is service containerization. Docker ensures that dependencies like system libraries and
environment variables are bundled together, providing consistency between development, testing,
and production environments. The BentoML build command makes it possible to create the Docker image and
push it to the local Docker registry. This containerized inference model can be used in a simple
docker-compose.yml to define the local deployment environment.</p>
      <preformat>version: "3.9"
services:
  energizer:
    image: split-histo:latest
    ports:
      - 3000:3000
    restart: on-failure
    networks:
      - energizer
networks:
  energizer:
    name: energizer</preformat>
      <p>This docker-compose setup can be enhanced with:</p>
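      <sec id="sec-10-4">
        <list list-type="bullet">
          <list-item><p>Volume mounts for persistent storage</p></list-item>
          <list-item><p>Environment-specific configurations</p></list-item>
          <list-item><p>Integration with monitoring services</p></list-item>
          <list-item><p>Load balancing for high-availability setups</p></list-item>
        </list>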
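        <p>Once the container is running locally, a client can query it over HTTP. The following sketch uses only the Python standard library. The URL is an assumption: port 3000 is taken from the compose file, and the path /predict assumes that the endpoint is named after the predict function. The helper names and the payload layout, a single row of input features, are hypothetical.</p>
        <preformat>
```python
import json
from urllib import request

# Assumptions of this sketch: the container from the compose file listens on
# localhost:3000 and the endpoint path matches the API function name "predict".
SERVICE_URL = "http://localhost:3000/predict"


def build_forecast_request(features: list, url: str = SERVICE_URL) -> request.Request:
    """Prepare a POST request carrying one row of input features as JSON."""
    body = json.dumps([features]).encode("utf-8")
    return request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_forecast(features: list, timeout: float = 5.0):
    """Send the request and decode the JSON response (needs a running service)."""
    with request.urlopen(build_forecast_request(features), timeout=timeout) as reply:
        return json.loads(reply.read().decode("utf-8"))
```
        </preformat>
        <p>Separating request construction from sending makes the payload easy to inspect without a running service.</p>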
        <p>Once the container image is available in the registry, it can be deployed to a Kubernetes cluster
using the orchestration platform Yatai. Kubernetes manages the deployment, autoscaling and
maintenance of the containers and ensures that they remain available and responsive. This setup
allows the model to handle real-time requests with low latency, making it suitable for
high-frequency market predictions. The Kubernetes deployment can be further optimized by the
following steps:</p>
        <p>• Implementation of horizontal pod autoscaling based on CPU or memory usage
• Configuration of network policies for enhanced security
• Definition of persistent volumes for model storage and caching
• Integration with cloud-native monitoring and logging solutions</p>
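        <p>The scaling rule behind horizontal pod autoscaling can be illustrated with a short calculation. The sketch below follows the rule documented for the Kubernetes Horizontal Pod Autoscaler, in which the desired replica count is the current count multiplied by the ratio of the observed metric to its target, rounded up, with no change while the ratio stays within a tolerance of 1. The replica bounds used as defaults here are assumptions, and the real controller applies further rules, such as stabilization windows, that are omitted.

```python
import math


def desired_replicas(current, metric, target,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Replica count suggested by the documented HPA scaling rule.

    current: number of running replicas
    metric:  observed average utilization (for example CPU percent)
    target:  utilization the autoscaler tries to maintain
    The default bounds are illustrative assumptions.
    """
    ratio = metric / target
    if abs(ratio - 1.0) > tolerance:
        wanted = math.ceil(current * ratio)
    else:
        wanted = current  # close enough to the target, keep the current count
    # Clamp the result to the configured replica bounds.
    return max(min_replicas, min(max_replicas, wanted))
```

For example, four replicas running at 90% CPU against a 60% target yield six replicas, while 62% against the same target stays within the tolerance and leaves the count unchanged.</p>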
        <p>In summary, the electrical energy market forecasting model can be efficiently deployed,
scaled and managed by following the workflow described above. This architecture can provide
reliable and timely predictions that support trading decisions in the volatile energy market
landscape.</p>
        <p>13. Conclusion</p>
        <p>This article demonstrated that developing a production-quality forecasting solution
requires multiple steps: data preprocessing and augmentation, selection of input parameters,
selection of the machine learning algorithm, hyperparameter optimization, model training and
serialization, addition of a REST API layer, creation of a Docker image, networking and autoscaling
configuration, and deployment of the service to a Kubernetes cluster. This list is clearly not
exhaustive.</p>
        <p>While some of the tools used in this research, such as the scikit-learn library and the BentoML
platform, are Python-based, many others, including Docker and Kubernetes, are cross-platform. It is
important to note that major software vendors take interoperability and reliability
seriously and invest considerable resources in platform-independent solutions such as the ONNX
standard. For instance, the winning algorithm, HistogramGradientBoosting, implemented in Python
has an equivalent implementation in .NET called LightGBM. It is advantageous to be able to develop
the model in one programming language and deploy it to an environment that better matches the
skill set of the infrastructure team.</p>
        <p>The forecasting algorithm could still be placed into a simple application that can be launched
even from the console. What, then, are the benefits of employing the complex technology stack
proposed in this article? The first is the ability to obtain the forecast on a remote device, which
can be another computer, a mobile device or a web page. In addition, the prediction can be customized
for a specific end user. Another important aspect is scalability. The business requirements for the
current task are limited to forecasting the trade volume once an hour, but this is just one example.
Once the forecasting algorithm is available, it is beneficial to leverage it across the enterprise, so
autoscaling and resiliency become important features that affect the company's financial goals.</p>
        <p>Another consideration is the design and development of such a custom forecasting solution. From
the perspective of agile project planning, a top-down decomposition of implementation tasks is much
more productive than starting a development process with unknown stages and many technical
challenges. It helps when the artifacts passed along the technological chain from one
stage to the next are known and well specified. This information makes it possible to separate
implementation tasks and to speed up project development with parallel work streams.</p>
      </sec>
    </sec>
    <sec id="sec-11">
      <title>Acknowledgements</title>
      <p>The authors are grateful to the scientists of the G. E. Pukhov Institute of Energy Modeling for
providing the hourly data on electricity markets in Ukraine.</p>
    </sec>
    <sec id="sec-12">
      <title>Declaration on Generative AI</title>
      <sec id="sec-12-1">
        <title>The authors have not employed any Generative AI tools.</title>
      </sec>
    </sec>
  </body>
  <back>
  </back>
</article>