
Machine learning-based information technology for analyzing energy peaks in power grid balancing

Dmytro Tymoshchuk

dmytro.tymoshchuk@gmail.com 1

Andrii Voloshchuk

Andriy Sverstiuk

andriy.voloschuk30@gmail.com 0

Halyna Osukhivska

osukhivska@tntu.edu.ua 1

Oksana Bahrii-Zaiats

0 I. Horbachevsky Ternopil National Medical University, Maidan Voli St., 1, Ternopil, 46002, Ukraine
1 Ternopil Ivan Puluj National Technical University, Ruska str. 56, Ternopil, 46001, Ukraine

2026

The article presents an information technology based on machine learning methods for detecting energy peaks and automatically balancing the power grid by activating storage installations or renewable energy sources. The study is based on hourly electricity consumption data for a month, described by nine statistical descriptors of amplitude variability and the LEC indicator with two balance classes. Five machine learning models (SVM, kNN, Random Forest, MLP, XGBoost) were compared, with hyperparameters selected by Grid Search with 5-fold cross-validation and the F1-score as the target metric. The best results were obtained for the XGBoost model (Accuracy ≈ 0.961), which indicates its high ability to recognize balanced (class 1) and unbalanced (class 2) power consumption modes. Permutation Feature Importance analysis confirmed that variability descriptors (Range, Std_Dev, Max) are crucial for classifying energy anomalies. The approach provides timely detection of unstable regimes and reduces false alarms, increasing the stability and reliability of the power system.

Keywords: energy peaks, machine learning, power grid balancing, information technology

1. Introduction

Modern energy systems are becoming increasingly complex, as they combine various generation sources, control systems and consumers with dynamic operating modes. This growing structural complexity makes them more vulnerable to external influences and fluctuations in electricity consumption. The load on the electricity grid is determined by a set of factors that directly affect consumer behavior, including standard daily, weekly and seasonal cycles and changes in weather conditions. Periods of extreme temperatures lead to a significant increase in electricity consumption due to increased use of heating or air conditioning systems. In addition, consumption is influenced by socio-economic factors, holiday periods, mass events, changes in industrial production and emergencies, which create peak loads and affect the stability of the energy system [1, 2].

Ensuring stable operation of the power grid under such fluctuations requires modern approaches to energy balancing and load forecasting. The use of renewable energy sources (solar, wind generation) and energy storage installations allows smoothing peak loads, compensating for short-term power shortages and increasing the flexibility of the system. In this context, the implementation of intelligent energy metering systems becomes an important step towards effective monitoring of energy consumption and distribution at the level of cities and regions [3-5]. For their effective management, the accuracy of input data and the adequacy of models are critically important, which is a fundamental aspect in complex technical systems [6].

ORCID: 0000-0003-0246-2236 (D. Tymoshchuk); 0009-0007-1478-1601 (A. Voloshchuk); 0000-0001-8644-0776 (A. Sverstiuk); 0000-0003-0132-1378 (H. Osukhivska); 0000-0002-5533-3561 (O. Bahrii-Zaiats)

To solve these complex problems, it is worth using modern technologies, in particular artificial intelligence (AI). Machine learning (ML) methods have become widespread in various fields, from medicine [7], finance [8] and materials science [9] to transport [10] and cybersecurity [11]. They make it possible to automate data analysis, identify hidden patterns, increase the accuracy of forecasts and make informed decisions based on large amounts of information. In the energy sector, these technologies are used to forecast consumption, detect anomalies, optimize equipment operating modes and increase network stability. The integration of ML algorithms into peak load analysis opens up new opportunities for more accurate prediction of system behavior and timely detection of instability risks.

Unlike traditional statistical methods, modern machine learning models are able to recognize hidden patterns in data, which allows developing more effective strategies for managing energy systems, ensuring stable operation of the power grid, and reducing energy supply costs. In the authors’ previous studies [12], a computer system for energy distribution under electricity shortage conditions was developed using AI.

Modern approaches to load forecasting, including meta-learning frameworks for selecting optimal models [13], contribute to improving the reliability of power systems. Comprehensive reviews of deep learning for intelligent demand management and load balancing in smart grids [14] confirm the growing role of ML in this area.

To assess the relevance of research on the use of ML methods in increasing the efficiency of energy solutions, an analytical query TITLE-ABS-KEY(("energy peak" OR "power system stability" OR "load forecasting" OR "power grid balancing" OR "energy management system") AND ("machine learning" OR "Random Forest" OR "XGBoost" OR "MLP" OR "neural network" OR "LSTM" OR "Transformer" OR "GNN")) was formulated in the Scopus scientometric database. The search returned 14,372 scientific papers, of which 9,137 were published in the 10 years from 2015 to 2024 (Figure 1).

The largest number of publications on the topic has been observed in the last 3 years: 1,328 in 2022, 1,675 in 2023, and 2,187 in 2024. This confirms the relevance of the problem and the steady growth of worldwide interest in it.

2. Related Work

The current state of research in power system stabilization is characterized by the rapid development of AI and ML methods for load forecasting, anomaly detection, and control optimization. The presented review covers key scientific achievements that demonstrate a variety of approaches: from hybrid statistical models [15] to innovative deep learning architectures such as graph neural networks and transformers.

A hybrid approach to long-term forecasting with hourly resolution is proposed in [16]: classical statistical regression models describe the underlying data structure (taking into account temperature and calendar factors), while a Long Short-Term Memory (LSTM) network models and corrects the residual error. Developing the idea of time series analysis, a hybrid architecture combining convolutional neural networks (CNN) and LSTM for predicting electricity consumption in residential buildings is presented [17]. This approach allows for the effective extraction of both local patterns and temporal dependencies in consumption data. An approach based on XGBoost and a factorization machine is proposed to assess the transient stability of power systems [18]. This method allows efficient processing of high-dimensional system state data and provides fast and accurate classification of power grid stability in real time. The use of graph neural networks for modeling the topological properties of power grids [19] makes it possible to take into account spatial relationships between different nodes of the system and to detect hidden patterns in load distribution, which is especially important for the analysis of cascading failures and network development planning. An approach to detecting anomalies in distributed power grids based on autoencoders and federated learning is proposed, which provides decentralized learning without transferring private data [20]. In this way, LSTM recurrent networks that model temporal dynamics are combined with CNNs that extract local patterns in the data. In addition, a data fusion technique is used, which makes it possible to combine consumption information with meteorological and other external factors. In this case, higher forecast accuracy is achieved compared to classical models such as ARIMA or Random Forest. Recent research demonstrates a wide range of ML approaches for power system analysis.
Deep learning, in particular Transformer-based architectures, combined with generative adversarial networks, show high efficiency for detecting anomalies in load time series [21]. Comparative analysis of the performance of Random Forest and XGBoost under different class imbalance conditions shows that gradient boosting generally outperforms traditional ensemble methods in power system classification problems [22]. A review of AI methods for assessing the dynamic stability of power systems, including deep learning-based approaches that allow classifying different types of disturbances and predicting the behavior of the system in critical modes, is presented in [23]. To further improve the accuracy of anomaly detection, the Transformer-GAN model was developed [24]. The architecture combines the Transformer module, which uses a self-attention mechanism to capture long-term dependencies, and a generative adversarial network (GAN), where the generator learns normal data patterns. A systematic review of the application of deep learning for intelligent demand response [25] demonstrates the effectiveness of DL methods for real-time load forecasting and demand management, which is critical for smart grid balancing and renewable energy integration. The application of ML for real-time load management demonstrates the potential of intelligent systems to improve the efficiency of smart grids [26].

These studies demonstrate the significant potential of using AI to improve the reliability of power systems through peak load balancing, in particular with the aim of using alternative sources of electricity or energy storage facilities.

The aim of this work is to develop information technology based on ML methods to detect energy peaks and the need to automatically connect additional renewable energy sources or energy storage facilities to prevent failure of energy nodes and increase the stability and balance of power grids.

3. Methodology

To ensure effective real-time monitoring and automatic balancing of the power grid, a comprehensive information technology is proposed that integrates three functional levels into a unified architecture (Figure 2).

The operation of the system starts at the first level, which performs continuous acquisition of electricity consumption data from a distributed network of metering devices. Data sources include smart energy meters at residential and industrial facilities, voltage and current sensors at substations, telemetry from renewable energy sources (RES), and monitoring systems of energy storage facilities (ESF). Data are collected with hourly resolution and transmitted over secure VPN channels using industrial communication protocols to the central node of the system, where they are stored in specialized databases.

The second level performs intelligent information processing. Based on validation results and the evaluation of accuracy metrics, the system selects a machine learning model that is integrated into the decision-making core. This model subsequently analyzes energy consumption patterns in real time and classifies the current state as “balanced” (absence of critical peaks) or “unbalanced” (presence of anomalies, that is, critical peaks). This approach ensures that power system control is carried out by the most effective algorithm, providing high accuracy in threat detection and minimizing false alarms.

The third level, the decision-making level, implements the physical balancing of the grid. Depending on the classification result, an automatic control scenario for distributed resources is executed. When a balanced state is detected, the system activates a mode of accumulating surplus energy in storage units and performs preventive monitoring of potential load peaks.
In the case of detecting an unbalanced state with critical consumption peaks, the system automatically initiates compensatory actions: increasing generation from renewable energy sources and discharging energy storage units to rapidly cover load peaks. Such an architecture ensures energy balance management with a minimal response time to critical changes in the power grid.
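
The decision-making level described above can be illustrated with a minimal sketch. It only shows how a predicted class could be mapped to a balancing scenario; the names `ControlAction` and `select_action` and the three boolean flags are hypothetical and are not taken from the described system.

```python
from dataclasses import dataclass

BALANCED = 1    # class 1: no critical peaks
UNBALANCED = 2  # class 2: critical consumption peaks


@dataclass(frozen=True)
class ControlAction:
    charge_storage: bool     # accumulate surplus energy in storage units
    discharge_storage: bool  # release stored energy to cover a peak
    boost_renewables: bool   # increase generation from renewable sources


def select_action(predicted_class: int) -> ControlAction:
    """Map the classifier output to a balancing scenario (hypothetical rule)."""
    if predicted_class == BALANCED:
        return ControlAction(charge_storage=True, discharge_storage=False,
                             boost_renewables=False)
    if predicted_class == UNBALANCED:
        return ControlAction(charge_storage=False, discharge_storage=True,
                             boost_renewables=True)
    raise ValueError(f"unknown class label: {predicted_class}")


print(select_action(UNBALANCED))
```

A real controller would additionally have to respect storage capacity and ramp limits of the generating units, which this sketch deliberately omits.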

The study is based on electricity consumption data from a regional energy company, which are presented in the form of hourly measurements. Figure 3 shows a portion of the data obtained for the period from April 20 to 26, 2025, with the peak amplitude values indicated (red). The dataset formed from the maximum (peak) values of electricity consumption is used for the analysis.

Nine statistical descriptors of the amplitude variability of electricity consumption were used as input parameters for the ML model: Mean (arithmetic mean), Median (median), Min / Max (minimum / maximum), Range (span), Std_Dev (standard deviation), SE (standard error), Sk (skewness), and Kurt (kurtosis). A tenth parameter, LEC (Level of Electric Consumption), serves as the target variable. Mean is a measure of central tendency, reflecting the average level of load or frequency deviation. Median is a robust characteristic of central tendency, resistant to the presence of outliers. Min / Max define the boundaries of the operating range. Range is the difference between the maximum and minimum values, a measure of the total magnitude of fluctuations. Std_Dev quantifies the volatility or dispersion of load/frequency. SE reflects the stability of the mean value within the window. Sk characterizes the asymmetry of the distribution, which can indicate sudden increases or decreases in indicators. Kurt is a measure of the peakedness of the distribution, which identifies the presence of extreme outliers (sharp spikes or dips).
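
As an illustration, the input descriptors from Mean to Kurt can be computed for one window of readings as sketched below. The source does not state which estimators were used, so the sample standard deviation (n - 1), the moment-based skewness, and the excess kurtosis are assumptions; the function name `describe_window` is hypothetical.

```python
import math
import statistics


def describe_window(values):
    """Return the amplitude-variability descriptors (Mean ... Kurt) for one window."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two readings")
    mean = statistics.fmean(values)
    std_dev = statistics.stdev(values)  # sample standard deviation (assumption)
    # Central moments used for moment-based skewness and excess kurtosis
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "Mean": mean,
        "Median": statistics.median(values),
        "Min": min(values),
        "Max": max(values),
        "Range": max(values) - min(values),
        "Std_Dev": std_dev,
        "SE": std_dev / math.sqrt(n),
        "Sk": m3 / m2 ** 1.5 if m2 > 0 else 0.0,
        "Kurt": m4 / m2 ** 2 - 3.0 if m2 > 0 else 0.0,
    }


print(describe_window([1, 2, 3, 4, 5]))
```

A window with a single sharp spike yields a large Range and Std_Dev and a positive Sk, which is the kind of signature the classifier is expected to pick up.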

The output parameter in the study is LEC, an indicator that reflects the level of balance of the electrical network. This parameter characterizes the current state of the system, which is divided into two classes: balanced (class 1) and unbalanced (class 2). The first class describes the operation mode of the electrical network with low variability of consumption, that is, when electricity consumption is stable and low and corresponds to the predicted indicators; in this mode, excess electricity is redirected to an energy storage facility to ensure stable operation of the electrical network. The second class corresponds to a mode with high variability, characterized by significant fluctuations between the minimum and maximum consumption values within an hour, uneven load or signs of instability, which may indicate that the permissible parameters of the power system are exceeded. When the state is classified as "need for balancing" (class 2), additional renewable energy sources or ESF energy resources must be connected to balance the load of the power grid as smoothly as possible.

The distribution of electricity consumption data by class is shown in Figure 4.

The generated dataset contained 2400 samples, evenly distributed between the two classes: 1200 samples of class 1 and 1200 samples of class 2. The classes were formed using a threshold value of electricity consumption, which for this implementation was 0.0749 MW. Values of electricity consumption that exceeded this threshold, which was calculated from the average value over the studied period, were considered "peak". To build and evaluate the ML models, the generated dataset was divided into training and test samples in a ratio of 70/30 while preserving the proportions of the target variable. The split was stratified and used a fixed parameter random_state = 32, which guarantees the reproducibility of the results and uniform representation of each class in both subsamples.
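
The labelling rule and the stratified 70/30 split can be sketched as follows. The feature matrix here is random stand-in data with the same shape as the described dataset, not the original measurements, and the helper `label_lec` is a hypothetical name; only the threshold 0.0749 MW, the split ratio, and random_state = 32 come from the text.

```python
import numpy as np
from sklearn.model_selection import train_test_split

THRESHOLD_MW = 0.0749  # threshold reported for this implementation


def label_lec(consumption_mw, threshold_mw=THRESHOLD_MW):
    """Return 2 (unbalanced, peak) above the threshold, otherwise 1 (balanced)."""
    return 2 if consumption_mw > threshold_mw else 1


# Synthetic stand-in: 2400 samples with nine descriptors, 1200 per class.
rng = np.random.default_rng(0)
X = rng.normal(size=(2400, 9))
y = np.repeat([1, 2], 1200)

# Stratified 70/30 split with the fixed seed mentioned in the text.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=32
)
print(len(y_train), len(y_test))  # 1680 720
```

With perfectly balanced classes, stratification places 360 samples of each class in the test subset.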

The work uses five ML algorithms: Support Vector Machine (SVM) [27], k-Nearest Neighbors (kNN) [28], Random Forest (RF) [29], Multilayer Perceptron (MLP) neural network [30], and Extreme Gradient Boosting (XGBoost) [31]. Random Forest provides high reliability when analyzing interrelated parameters by using an ensemble of independent decision trees. Each tree is trained on a random subset of features and data, which reduces the impact of multicollinearity and random noise. This approach increases the stability and generalization ability of the model. XGBoost is optimized for fast prediction and is able to work effectively in conditions where the system needs to respond quickly to changing modes. The algorithm is based on the sequential construction of decision trees, which gradually reduce the error of previous models. Due to its high performance, parallel computing, and efficient memory usage, XGBoost is often used for tasks that require fast real-time decision-making. Due to its architecture and nonlinear activation functions, MLP can reproduce complex dependencies that are not detected by traditional methods. kNN is considered a baseline method based on the principle of similarity between samples. Each new object is classified depending on the classes of its nearest neighbors in the feature space. This approach makes it possible to assess how clearly the classes are separated in the given feature set and how well the constructed descriptors reflect the characteristics of the system’s energy states. SVM is a classic method for constructing an optimal separating hyperplane that maximizes the distance between classes in the feature space. Comparative analysis of different ML algorithms is standard practice for selecting the optimal model, especially important when solving problems for critical infrastructure, such as the power system, where the reliability of classification is of paramount importance.

To solve the problem of classifying the balance states of the power grid, a software solution was developed in Python using the scikit-learn and XGBoost ML libraries. StandardScaler was used to normalize the input data before training the kNN, SVM, and MLP models, which ensured a single scale of features and increased the stability of the training process. For the ensemble models Random Forest and XGBoost, data normalization was not performed, since these algorithms are insensitive to the scales of the input parameters. To understand the decision-making mechanisms of the model and determine the most informative statistical descriptors, the Permutation Feature Importance (PFI) global analysis method was used [32]. This approach quantifies the contribution of each feature to the prediction by measuring the change in model accuracy after randomly breaking the association between a specific descriptor and the target variable. The main idea is that if a feature has a significant impact on the result, randomly shuffling its values will lead to a noticeable decrease in classification performance. The PFI method belongs to the global explainability methods and does not depend on the type of model. Its advantages are simplicity of implementation, intuitive interpretation of results and the ability to compare the influence of different features. The main limitation is the increased computational cost associated with the need for multiple predictions. Despite this, PFI remains one of the most effective methods for assessing the informativeness of descriptors in explainable ML tasks.
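
The idea behind PFI can be shown with a small model-agnostic sketch in plain Python. It is an illustration under stated assumptions, not the code used in the study: the toy data, the threshold rule standing in for a trained model, and the function names `accuracy` and `pfi` are all hypothetical. In scikit-learn, the same technique is available as `sklearn.inspection.permutation_importance`.

```python
import random


def accuracy(predict, X, y):
    """Share of rows for which predict(row) equals the true label."""
    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)


def pfi(predict, X, y, n_repeats=10, seed=0):
    """Mean drop in accuracy after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = accuracy(predict, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the target
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(predict, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances


# Toy data: feature 0 plays the role of Range and fully determines the class,
# feature 1 is pure noise. The "model" is a hypothetical threshold rule.
data_rng = random.Random(1)
X = [[data_rng.random(), data_rng.random()] for _ in range(200)]
y = [2 if row[0] > 0.5 else 1 for row in X]
model = lambda row: 2 if row[0] > 0.5 else 1

importances = pfi(model, X, y)
print(importances)
```

Shuffling the informative feature roughly halves the accuracy, while shuffling the noise feature leaves it unchanged, which is the contrast PFI is designed to expose.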

The performance of the models was assessed through the analysis of the confusion matrix, which systematizes the results of predictions into four categories. True positive results (TP) record cases of correct detection of electricity consumption peaks, while true negative results (TN) reflect the correct identification of normal modes. Type I errors (false positives, FP) correspond to false signals about peaks, and Type II errors (false negatives, FN) to missed critical states of the power system. From these basic indicators, key performance metrics are formed: Accuracy, Recall, Specificity, Precision, F1-Score and the geometric mean G-Mean [33].

Efficiency metrics of classification models play a critical role in power grid monitoring tasks, since grid balance depends not only on the accuracy of diagnostics but also on the timeliness of the control system’s response. In the case of recognizing peak power consumption states (class 2) and stable modes (class 1), these indicators are directly related to the reliability of automatic connection of RES or ESF, as well as to the prevention of overloads and failure of critical network elements. Accuracy reflects the proportion of correctly classified states among all predictions. In the context of energy systems, a high Accuracy value indicates the model’s ability to correctly recognize both normal and peak modes, which ensures the reliability of the overall monitoring system. However, this metric by itself may not be informative enough in the case of imbalanced data, when the number of normal states significantly exceeds the number of peak states. Recall is a key metric for this task, as it characterizes the model’s ability to detect all peak load cases. High Recall minimizes the number of missed critical situations (FN). From the point of view of operational security of power grids, Recall is of priority importance, since even one missed peak can cause cascading failures or blackouts. Specificity reflects the model’s ability to correctly identify normal operating modes of the system. High Specificity prevents false activations of balancing systems, which reduces the number of unnecessary RES or ESF switching cycles. This is especially important for the economic efficiency of the power system, since each unnecessary operation entails additional energy costs and accelerates equipment wear. Precision characterizes the proportion of real peak states among all those that the model has identified as critical. High Precision means that the system reacts only to real threats, and not to random fluctuations in consumption.
Thus, it reduces the number of false positive states (FP), optimizes the use of balancing resources, and maintains the efficiency of energy flow management. F1-Score is the harmonic mean of Precision and Recall, which provides a generalized assessment of the balance between detecting all peak states and minimizing false alarms. For automatic grid balancing systems, a high F1-Score ensures that the algorithm is both sensitive to real threats and stable with respect to noise in the data. This metric is especially important at the stage of selecting the optimal model, when a compromise between security and efficiency of network operation must be found. G-Mean (geometric mean of Recall and Specificity) is used to assess the balance of the classification. A high G-Mean value indicates that the model recognizes both peak and normal modes equally well, without favoring any class. For power system control tasks, this means stable operation of the algorithm under different load conditions, including non-standard situations or variable consumption profiles.
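
The metrics described above follow directly from the four confusion-matrix counts, as the sketch below shows with the peak class (class 2) treated as positive. The counts in the example are not reported in the text: they are reconstructed under the assumption of 360 test samples per class (30% of 2400, stratified) from the XGBoost percentages given in Section 4 (97.22% of class 2 and 95.00% of class 1 classified correctly).

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Metrics derived from confusion-matrix counts; the peak class is positive."""
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    return {
        "Accuracy": (tp + tn) / (tp + fp + tn + fn),
        "Recall": recall,
        "Specificity": specificity,
        "Precision": precision,
        "F1": 2 * precision * recall / (precision + recall),
        "G-Mean": math.sqrt(recall * specificity),
    }


# Reconstructed counts (assumption): 350 of 360 peaks and 342 of 360 normal
# states classified correctly.
m = binary_metrics(tp=350, fp=18, tn=342, fn=10)
print({name: round(value, 4) for name, value in m.items()})
```

Under this assumption the computed Accuracy is about 0.9611 and the F1-Score about 0.9615, in agreement with the values reported for XGBoost.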

In addition to the basic metrics, the integral indicators Area Under the ROC Curve (AUC) and Precision–Recall (PR) curve were used to comprehensively assess the effectiveness of the classification models. AUC reflects the ability of the model to distinguish between balanced and unbalanced power consumption modes at different decision thresholds. A high AUC value (close to 1) indicates a high discriminative ability of the algorithm. In the context of power systems, this means that the model is able to timely recognize the approach to critical network operating modes and prevent accidents by early load balancing. Precision–Recall curve provides a more detailed picture of the model’s behavior in conditions of class imbalance, when the number of peak states is relatively small compared to normal ones. Analysis of the area under the PR curve allows us to assess the trade-off between Precision and Recall. A high value of Average Precision (AP), which numerically corresponds to the area under the PR curve, indicates the model’s ability not only to effectively detect peaks, but also to minimize the number of false signals. This is particularly important in the context of automatic connection of renewable energy sources or energy storage installations, where false triggering can lead to unnecessary energy losses and reduced control system efficiency.
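
For intuition, both integral indicators can be computed by hand on a handful of scores. The sketch below uses the rank interpretation of AUC (the probability that a randomly chosen peak sample receives a higher score than a randomly chosen normal one) and the step-wise definition of Average Precision. The function names and the four-sample example are illustrative, and the AP routine assumes there are no tied scores.

```python
def roc_auc(y_true, scores, positive=2):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true, scores, positive=2):
    """AP as the sum of precision values weighted by recall increments (no ties)."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    n_pos = sum(t == positive for t in y_true)
    tp, ap = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == positive:
            tp += 1
            ap += (tp / rank) / n_pos  # precision at this rank times recall step
    return ap


# Hypothetical scores for two normal (1) and two peak (2) samples.
y_true = [1, 1, 2, 2]
scores = [0.10, 0.40, 0.35, 0.80]
print(roc_auc(y_true, scores))            # 0.75
print(average_precision(y_true, scores))
```

In this example one peak sample is scored below a normal one, so three of the four positive-negative pairs are ordered correctly and the AUC is 0.75.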

4. Results and Discussion

Five ML models were used to solve the problem of classifying power grid operating modes. The models were tuned by hyperparameter optimization before the training stage to achieve maximum classification accuracy. The search for optimal combinations of hyperparameters was carried out using the Grid Search method in combination with 5-fold cross-validation, which provided a reliable assessment of the generalization ability of the models on the training data set. F1-Score was chosen as the target optimization metric, since it takes into account both false positive (FP) and false negative (FN) results, providing a balanced ratio between Precision and Recall. The use of this metric is more appropriate compared to Accuracy, since it avoids the bias of the model towards the dominant class and better reflects the ability of the algorithm to correctly detect both stable and peak consumption modes. The results of tuning the main hyperparameters for each of the five ML models are given in Appendix A, which presents the optimal parameter values obtained during the grid search process.
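
A tuning procedure of this kind can be sketched with scikit-learn's `GridSearchCV`. The example below is illustrative only: it uses synthetic data instead of the study's dataset, kNN as a stand-in estimator, and a small made-up grid rather than the grids from Appendix A. Placing `StandardScaler` inside a `Pipeline` and treating class 2 as the positive class for F1 are design assumptions of the sketch.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import f1_score, make_scorer
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data with nine features and labels 1/2 (not the study's data).
X, y = make_classification(n_samples=300, n_features=9, random_state=32)
y = y + 1

# Scaling inside the pipeline is refitted on each training fold,
# so the held-out fold does not influence the scaler.
pipeline = Pipeline([("scale", StandardScaler()), ("knn", KNeighborsClassifier())])
param_grid = {  # illustrative grid, not the one from Appendix A
    "knn__n_neighbors": [3, 5, 7],
    "knn__weights": ["uniform", "distance"],
}

search = GridSearchCV(
    pipeline,
    param_grid,
    scoring=make_scorer(f1_score, pos_label=2),  # peak class as positive (assumption)
    cv=5,
)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```

The same pattern applies to the other four models by swapping the estimator and its parameter grid; for Random Forest and XGBoost the scaling step can be dropped.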

All five algorithms demonstrated the ability to classify energy regimes, but the models showed different levels of effectiveness.

Figure 5 shows the normalized confusion matrices (%) for the kNN and SVM models, which reflect the quality of the classification of power consumption states into two classes.

High agreement between actual and predicted labels is observed for the kNN model: the proportion of correct classifications is 92.78% for class 1 and 93.61% for class 2, indicating a balanced ability of the algorithm to identify both steady and peak consumption modes. In contrast, the SVM model shows lower accuracy for class 1 (73.89%) and a slight decrease in efficiency in recognizing peak states for class 2 (92.50%).

Figure 6 shows the normalized confusion matrices (%) for the Random Forest and MLP models.

The Random Forest model demonstrated high performance, providing 93.89% correct predictions for class 1 and 93.33% for class 2, indicating the stable ability of the ensemble of decision trees to recognize both normal and peak modes. The MLP model showed even better results: 94.44% correct classifications for class 1 and 95.28% for class 2.

Figure 7 presents the normalized confusion matrix (%) for the XGBoost model, which demonstrates the highest classification accuracy among all the algorithms considered.

The correct prediction rate is 95.00% for class 1 and 97.22% for class 2, indicating an exceptionally high ability of the model to recognize both stable and peak power consumption modes. Compared to previous models (kNN, SVM, Random Forest, MLP), XGBoost provides the lowest number of false classifications. The results confirm the superiority of gradient boosting in power consumption data analysis tasks, where high classification accuracy is important.

The performance results of the studied models for classifying electricity consumption peaks are summarized in Table 1.

Analyzing the data in Table 1, the highest overall performance was demonstrated by the XGBoost model, achieving Accuracy = 0.9611, Recall = 0.9500–0.9722, and F1-Score ≈ 0.961. This indicates the high ability of the model to distinguish between steady-state and peak power consumption modes. The high and balanced Recall and Specificity values (0.9500–0.9722) confirm that the model is equally effective in detecting both normal and critical load states. Such accuracy is especially valuable for real-time monitoring systems, where missing or false detection of peaks can lead to overloading of nodes and power system failures. Ensemble (XGBoost, Random Forest) and neural approaches (MLP) outperform methods based on metric distances or hyperplane separation (kNN, SVM), confirming their ability to more effectively account for nonlinear multivariate relationships between electricity consumption parameters. This makes such models suitable for tasks such as automatic detection of load imbalances, activation of renewable energy sources or energy storage systems, and ensuring real-time grid stability.

Figure 8 shows the main performance curves of the XGBoost model, illustrating its ability to effectively classify power consumption modes and accurately distinguish between steady and peak system states.


The ROC curve for the XGBoost model with a 95% confidence interval (Figure 8a) characterizes the model’s ability to distinguish between balanced (class 1) and unbalanced (class 2) power consumption modes. The area under the ROC curve (AUC = 0.9617) indicates a high discriminative ability of the model. An AUC value close to 1 indicates that XGBoost effectively distinguishes between stable and critical power system operation modes. The narrow 95% confidence interval confirms the robustness of the model to changes in the data and the low variability of the results during repeated estimations. Thus, the ROC curve confirms that XGBoost not only achieves high classification accuracy, but also provides reliable and balanced detection of power grid states in both classes. The Precision–Recall (PR) curve (Figure 8b) reflects the interdependence between Precision and Recall when classifying peak power consumption modes. The area under the curve (Average Precision, AP = 0.942) indicates excellent classification quality: the model maintains high values of both metrics over a wide range of thresholds. A narrow 95% confidence interval indicates stability of results and low variability of predictions. Thus, the PR curve confirms that XGBoost is able to effectively detect peak power consumption modes, ensuring a minimum number of false positives and maintaining high reliability of the decisions made. The curve of the F1-score as a function of the classification threshold for the XGBoost model (Figure 8c) demonstrates how the balance between Precision and Recall changes depending on the selected decision threshold. The maximum value of F1-score = 0.9615 is achieved at the optimal threshold of 0.42, which provides the best ratio between correct detection of peak states and minimization of false alarms. In the low range of thresholds, high Recall prevails at the expense of reduced Precision, while too high thresholds cause the model to lose sensitivity to critical states.
Therefore, the chosen threshold of 0.42 is an optimal compromise between the two key performance indicators, ensuring the most effective performance of XGBoost in detecting peak power consumption modes.
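
A threshold of this kind can be found by sweeping candidate values and keeping the one with the highest F1-score. The sketch below illustrates the procedure on eight made-up probabilities of class 2; the function name `best_f1_threshold`, the data, and the 0.05 step of the grid are hypothetical and do not reproduce the reported value of 0.42.

```python
def best_f1_threshold(y_true, proba, thresholds, positive=2):
    """Return (threshold, F1) with the highest F1 over the candidate thresholds."""
    best = (None, -1.0)
    for thr in thresholds:
        tp = sum(p >= thr and t == positive for t, p in zip(y_true, proba))
        fp = sum(p >= thr and t != positive for t, p in zip(y_true, proba))
        fn = sum(p < thr and t == positive for t, p in zip(y_true, proba))
        f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
        if f1 > best[1]:  # keep the first (lowest) threshold on ties
            best = (thr, f1)
    return best


# Hypothetical predicted probabilities of class 2 for eight samples.
y_true = [1, 1, 1, 2, 1, 2, 2, 2]
proba = [0.05, 0.20, 0.35, 0.40, 0.45, 0.60, 0.80, 0.95]
grid = [i / 100 for i in range(5, 100, 5)]  # 0.05, 0.10, ..., 0.95

threshold, f1 = best_f1_threshold(y_true, proba, grid)
print(threshold, round(f1, 4))  # 0.4 0.8889
```

Lower thresholds in this toy example admit more false alarms, while higher ones start to miss the peak sample scored at 0.40, which mirrors the trade-off described for Figure 8c.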

To assess the contribution of each feature to the prediction and better understand the decision-making mechanisms of the XGBoost model, a feature importance analysis was conducted using the Permutation Importance method (Figure 9).

The feature importance analysis showed that the key role in classification is played by descriptors of the variability and range of electricity consumption. The most significant predictor of network balance was Range, which is logical, since energy peaks are characterized by large swings between minimum and maximum load values. Next in importance are the standard deviation (Std_Dev) and the maximum value (Max), which also reflect the variability of energy parameters. Kurt, Min and SE have a moderate impact. The central characteristics (Med, Mean, Sk), which describe the position and symmetry of the distribution of energy consumption, turned out to be less informative for classification, since they reflect only the average load level, while peak states are determined mainly by amplitude and variation indicators that record rapid and significant changes in electricity consumption. The results are consistent with the physical nature of electrical peaks and support the hypothesis that dynamic characteristics of power consumption (how quickly and how much the load changes) are more informative for detecting anomalies than static characteristics (for example, the absolute level of power consumption). These results can be used not only to optimize classification algorithms, but also to configure monitoring sensors and power consumption data acquisition systems, where the frequency and accuracy of measurements can be increased specifically for parameters that characterize load variability.
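To make the contrast between amplitude and central descriptors concrete, the sketch below computes the nine descriptors named above for two short load windows. Several details are assumptions, since this section does not specify them: the window length, the use of the sample standard deviation (n − 1), and the moment-based estimators of skewness and excess kurtosis. The function name `descriptors` and the load values are hypothetical.

```python
import math
import statistics


def descriptors(load):
    """Nine statistical descriptors of one window of load readings."""
    n = len(load)
    mean = statistics.fmean(load)
    std = statistics.stdev(load)  # sample standard deviation (n - 1), an assumption
    # Central moments for moment-based skewness and excess kurtosis (assumed estimators).
    m2 = sum((x - mean) ** 2 for x in load) / n
    m3 = sum((x - mean) ** 3 for x in load) / n
    m4 = sum((x - mean) ** 4 for x in load) / n
    return {
        "Mean": mean,
        "Med": statistics.median(load),
        "Min": min(load),
        "Max": max(load),
        "Range": max(load) - min(load),
        "Std_Dev": std,
        "SE": std / math.sqrt(n),
        "Sk": m3 / m2 ** 1.5,
        "Kurt": m4 / m2 ** 2 - 3.0,
    }


# Hypothetical load windows: the second differs only by one sharp peak.
flat = [100, 102, 98, 101, 99, 100]
peaky = [100, 102, 98, 160, 99, 101]

d_flat, d_peaky = descriptors(flat), descriptors(peaky)
for name in ("Med", "Mean", "Range", "Std_Dev", "Max"):
    print(f"{name}: flat={d_flat[name]:.2f}, peaky={d_peaky[name]:.2f}")
```

In this toy case a single peak barely moves the median, whereas Range, Std_Dev and Max change sharply, which matches the ranking reported above.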

5. Conclusion

The paper proposes an information technology for detecting energy peaks and supporting automatic balancing of the power grid by connecting renewable energy sources or energy storage facilities. The hourly electricity consumption indicators of a regional energy company, which reflect daily load fluctuations, were used as the input data. Five ML models with hyperparameter optimization (Grid Search, 5-fold CV) were compared using the F1-Score metric. The best results were demonstrated by the XGBoost model (Accuracy = 0.9611, high F1-Score and G-Mean), which confirms its ability to consistently recognize both peak and normal modes of electricity consumption. The Permutation Feature Importance analysis showed that the key contribution to the classification is made by amplitude-variation features (Range, Std_Dev, Max), which reflect the intensity and dynamics of changes in electricity consumption. The proposed approach provides reliable differentiation between stable and peak states, reduces the number of false positives, and increases the timeliness of the control system response. The developed technology is suitable for integration into automated energy management systems (EMS) as an intelligent module, which increases the stability of the power system and reduces the risk of unplanned outages.

Declaration on Generative AI

During the preparation of this work, the authors used Grammarly for grammar and spell checking and to improve the readability of the text. After using the tool, the authors reviewed and edited the content as needed and take full responsibility for the publication’s content.

References

[10] T. Yuan, W. Rocha Neto, C. E. Rothenberg, K. Obraczka, C. Barakat, T. Turletti, Machine learning for next-generation intelligent transportation systems: A survey, Trans. Emerg. Telecommun. Technol. 33(4) (2021). doi:10.1002/ett.4427.

[11] Y. Klots, V. Titova, N. Petliak, D. Tymoshchuk, N. Zagorodna, Intelligent data monitoring anomaly detection system based on statistical and machine learning approaches, CEUR Workshop Proceedings 4042 (2025) 80–89.

[12] A. Voloshchuk, D. Velychko, H. Osukhivska, A. Palamar, Computer system for energy distribution in conditions of electricity shortage using artificial intelligence, Proc. 2nd Int. Workshop on Computer Information Technologies in Industry 4.0 (CITI 2024) 3742 (2024) 66–75, Ternopil, Ukraine.

[13] Y. Li, S. Zhang, R. Hu, N. Lu, A meta-learning based distribution system load forecasting model selection framework, Appl. Energy 294 (2021) 116991. doi:10.1016/j.apenergy.2021.116991.

[14] H. Zhou, S. Zhang, J. Peng, S. Zhang, J. Li, H. Xiong, W. Zhang, Informer: Beyond efficient transformer for long sequence time-series forecasting, Proc. AAAI Conf. Artif. Intell. 35(12) (2021) 11106–11115. doi:10.1609/aaai.v35i12.17325.

[15] A. Voloshchuk, H. Osukhivska, M. Khvostivskyi, A. Sverstiuk, Application of periodically correlated stochastic processes for forecasting electricity consumption, Meas. Comput. Devices Technol. Process. 3 (2025) 393–403. doi:10.31891/2219-9365-2025-83-48.

[16] W. Zhang, H. Quan, D. Srinivasan, Parallel and reliable probabilistic load forecasting via quantile regression forest and quantile determination, Energy 160 (2018) 810–819. doi:10.1016/j.energy.2018.07.019.

[17] T.-Y. Kim, S.-B. Cho, Predicting residential energy consumption using CNN-LSTM neural networks, Energy 182 (2019) 72–81. doi:10.1016/j.energy.2019.05.230.

[18] N. Li, B. Li, L. Gao, Transient stability assessment of power system based on XGBoost and factorization machine, IEEE Access 8 (2020) 28403–28414. doi:10.1109/access.2020.2969446.

[19] Z. Wu, S. Pan, F. Chen, G. Long, C. Zhang, P. S. Yu, A comprehensive survey on graph neural networks, IEEE Trans. Neural Netw. Learn. Syst. 32(1) (2021) 4–24. doi:10.1109/tnnls.2020.2978386.

[20] K. Kea, Y. Han, T.-K. Kim, Enhancing anomaly detection in distributed power systems using autoencoder-based federated learning, PLOS ONE 18(8) (2023) e0290337. doi:10.1371/journal.pone.0290337.

[21] M. Imani, A. Beikmohammadi, H. R. Arabnia, Comprehensive analysis of random forest and XGBoost performance with SMOTE, ADASYN, and GNUS under varying imbalance levels, Technologies 13(3) (2025) 88. doi:10.3390/technologies13030088.

[22] H. Huang, Z. Li, H. Beng Gooi, H. Qiu, X. Zhang, C. Lv, R. Liang, D. Gong, Distributionally robust energy-transportation coordination in coal mine integrated energy systems, Appl. Energy 333 (2023) 120577. doi:10.1016/j.apenergy.2022.120577.

[23] P. Sarajcev, A. Kunac, G. Petrovic, M. Despalatovic, Artificial intelligence techniques for power system transient stability assessment, Energies 15(2) (2022) 507. doi:10.3390/en15020507.

[24] J. Duan, Deep learning anomaly detection in AI-powered intelligent power distribution systems, Front. Energy Res. 12 (2024). doi:10.3389/fenrg.2024.1364456.

[25] P. Boopathy, M. Liyanage, N. Deepa, M. Velavali, S. Reddy, P. K. R. Maddikunta, N. Khare, T. R. Gadekallu, W.-J. Hwang, Q.-V. Pham, Deep learning for intelligent demand response and smart grids: A comprehensive survey, Comput. Sci. Rev. 51 (2024) 100617. doi:10.1016/j.cosrev.2024.100617.

[26] M. Imani, A. Beikmohammadi, H. R. Arabnia, Comprehensive analysis of random forest and XGBoost performance with SMOTE, ADASYN, and GNUS under varying imbalance levels, Technologies 13(3) (2025) 88. doi:10.3390/technologies13030088.

[27] Support vector machines, scikit-learn documentation. URL: https://scikit-learn.org/stable/modules/svm.html.

[28] E. Kavlakoglu, What is the k-nearest neighbors algorithm?, IBM (online). URL: https://www.ibm.com/think/topics/knn.

[29] RandomForestClassifier, scikit-learn documentation. URL: https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html.

[30] S. Haykin, Neural networks and learning machines, Pearson Education, 2009.

[31] E. Kavlakoglu, E. Russi, What is XGBoost?, IBM (online). URL: https://www.ibm.com/think/topics/xgboost.

[32] Permutation feature importance, scikit-learn documentation. URL: https://scikit-learn.org/0.24/modules/permutation_importance.html.

[33] Classification performance metrics and indices, Online resource. URL: https://adriancorrendo.github.io/metrica/articles/available_metrics_classification.html.

[34] O. Savenko, S. Lysenko, A. Kryshchuk, Y. Klots, Botnet detection technique for corporate area network, in Proceedings of the IEEE 7th International Conference on Intelligent Data Acquisition and Advanced Computing Systems (IDAACS), IEEE, 2013, pp. 363–368.

A. Hyperparameter Settings

SVM (Parameter = Value: Description/Purpose)

kernel = rbf: Kernel type, Radial Basis Function (RBF).

C = 1.0: Regularization parameter controlling the trade-off between maximizing the margin and minimizing classification errors.

gamma = scale: Kernel coefficient defining the influence radius of individual training samples; automatically scaled as 1 / (n_features × Var(X)).

kNN (Description/Purpose)

Number of nearest neighbors considered when classifying a new instance.

Weighting scheme (distance): each neighbor’s contribution is inversely proportional to its distance from the query point.

Distance metric (minkowski) used to measure similarity between samples.

Power parameter for the Minkowski metric; p = 2 corresponds to Euclidean distance.

Random Forest (Description/Purpose)

Number of decision trees in the ensemble.

Split criterion measuring node impurity based on the Gini index.

Maximum tree depth is unrestricted; each tree grows until all leaves are pure.

Feature selection method for splitting: sqrt(n_features).

Class weights (balanced): automatically adjusted based on class frequency in each bootstrap subsample to handle imbalance.

Bootstrap sampling (random sampling with replacement) is enabled for tree construction.

Minimum number of samples required to split an internal node.

Minimum number of samples required to be at a leaf node.

MLP (Description/Purpose)

Network architecture: three hidden layers with 16, 8, and 16 neurons, respectively.

Nonlinear activation function.

Adaptive optimization algorithm (Adam) used for updating network weights.

Learning rate schedule (adaptive): dynamically adjusts the learning rate based on validation error trends.

Initial learning rate for the optimizer.

alpha: L2 regularization parameter preventing overfitting.

Early stopping (True): stops training when validation performance no longer improves.

validation_fraction = 0.1: Fraction of data reserved for validation during training.

Maximum number of training iterations: 1300.

n_iter_no_change = 50: Number of epochs with no improvement before early stopping is triggered.

XGBoost (Description/Purpose)

Histogram-based tree construction method that accelerates training and saves memory.

L2 regularization parameter used to reduce overfitting and stabilize learning.
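The scaling rule quoted for the SVM kernel coefficient, 1 / (n_features × Var(X)), can be checked with a short worked example. This is a sketch under assumptions: Var(X) is taken as the population variance over all entries of the feature matrix, which is how scikit-learn computes gamma="scale"; the helper `gamma_scale` and the toy matrix are hypothetical, and the parameter dictionary simply restates the SVM values listed in this appendix.

```python
import statistics

# SVM settings as listed in this appendix.
svm_params = {"kernel": "rbf", "C": 1.0, "gamma": "scale"}


def gamma_scale(X):
    """gamma = 1 / (n_features * Var(X)), with Var taken over all entries of X."""
    n_features = len(X[0])
    flat = [value for row in X for value in row]
    return 1.0 / (n_features * statistics.pvariance(flat))


# Hypothetical 2 x 2 feature matrix: entries 0, 2, 2, 4 have mean 2 and variance 2.
X = [[0.0, 2.0], [2.0, 4.0]]
print(gamma_scale(X))  # 1 / (2 * 2) = 0.25
```

Since gamma depends on the overall variance of the inputs, features on very different scales would normally be standardized before fitting so that no single descriptor dominates the kernel.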

[1] K. P. Amber, Ahmad, M. W. Aslam, Kousar, Usman, M. S. Khan, Intelligent techniques for forecasting electricity consumption of buildings, Energy 157 (2018) 886–893. doi:10.1016/j.energy.2018.05.155.

[2] Williams, Short, Electricity demand forecasting for decentralised energy management, Energy Built Environ. 1(2) (2020) 178–186. doi:10.1016/j.enbenv.2020.01.001.

[3] D. B. Avancini, J. J. P. C. Rodrigues, S. G. B. Martins, R. A. L. Rabêlo, Al-Muhtadi, Solic, Energy meters evolution in smart grids: A review, J. Clean. Prod. 217 (2019) 702–715. doi:10.1016/j.jclepro.2019.01.229.

[4] Dileep, A survey on smart grid technologies and applications, Renew. Energy 146 (2020) 2589–2625. doi:10.1016/j.renene.2019.08.092.

[5] Rangel-Martinez, K. D. P. Nigam, L. A. Ricardez-Sandoval, Machine learning on sustainable energy: A review and outlook on renewable energy systems, catalysis, smart grid and energy storage, Chem. Eng. Res. Des. 174 (2021) 414–441. doi:10.1016/j.cherd.2021.08.013.

[6] Ahmad, Zhang, Yan, A review on renewable energy and electricity requirement forecasting models for smart grid and buildings, Sustain. Cities Soc. 55 (2020) 102052. doi:10.1016/j.scs.2020.102052.

[7] Herasymiuk, Sverstiuk, I. Kit, Multifactor regression model for prediction of chronic rhinosinusitis recurrence, Wiadomosci Lek. 76(5) (2023) 928–935. doi:10.36740/wlek202305106.

[8] Ahmed, M. M. Alshater, A. E. Ammari, H. Hammami, Artificial intelligence and machine learning in finance: A bibliometric review, Res. Int. Bus. Financ. 61 (2022) 101646. doi:10.1016/j.ribaf.2022.101646.

[9] Tymoshchuk, I. Didych, Maruschak, Yasniy, Mykytyshyn, Mytnyk, Machine learning approaches for classification of composite materials, Modelling 6(4) (2025) 118. doi:10.3390/modelling6040118.