<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Evaluating Multi-task Curriculum Learning for Forecasting Energy Consumption in Electric Heavy-duty Vehicles</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
				<date type="published" when="2024-10">October 2024</date>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Yuantao</forename><surname>Fan</surname></persName>
							<email>fan@hh.se</email>
							<affiliation key="aff0">
								<orgName type="department">Center for Applied Intelligent Systems Research (CAISR)</orgName>
								<address>
									<addrLine>Kristian IV:s väg 3</addrLine>
									<postCode>301 18</postCode>
									<settlement>Halmstad</settlement>
									<country key="SE">Sweden</country>
								</address>
							</affiliation>
							<affiliation key="aff1">
								<orgName type="institution">Halmstad University</orgName>
								<address>
									<addrLine>2 ; Gropegårdsgatan 2</addrLine>
									<postCode>417 15</postCode>
									<settlement>Göteborg</settlement>
									<country>Sweden, Volvo Group</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Sławomir</forename><surname>Nowaczyk</surname></persName>
							<email>slawomir.nowaczyk@hh.se</email>
							<affiliation key="aff0">
								<orgName type="department">Center for Applied Intelligent Systems Research (CAISR)</orgName>
								<address>
									<addrLine>Kristian IV:s väg 3</addrLine>
									<postCode>301 18</postCode>
									<settlement>Halmstad</settlement>
									<country key="SE">Sweden</country>
								</address>
							</affiliation>
							<affiliation key="aff1">
								<orgName type="institution">Halmstad University</orgName>
								<address>
									<addrLine>2 ; Gropegårdsgatan 2</addrLine>
									<postCode>417 15</postCode>
									<settlement>Göteborg</settlement>
									<country>Sweden, Volvo Group</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Zhenkan</forename><surname>Wang</surname></persName>
							<email>zhenkan.wang@volvo.com</email>
						</author>
						<author>
							<persName><forename type="first">Sepideh</forename><surname>Pashami</surname></persName>
							<email>sepideh.pashami@hh.se</email>
							<affiliation key="aff0">
								<orgName type="department">Center for Applied Intelligent Systems Research (CAISR)</orgName>
								<address>
									<addrLine>Kristian IV:s väg 3</addrLine>
									<postCode>301 18</postCode>
									<settlement>Halmstad</settlement>
									<country key="SE">Sweden</country>
								</address>
							</affiliation>
							<affiliation key="aff1">
								<orgName type="institution">Halmstad University</orgName>
								<address>
									<addrLine>2 ; Gropegårdsgatan 2</addrLine>
									<postCode>417 15</postCode>
									<settlement>Göteborg</settlement>
									<country>Sweden, Volvo Group</country>
								</address>
							</affiliation>
							<affiliation key="aff2">
								<orgName type="institution">Research Institutes of Sweden (RISE)</orgName>
								<address>
									<addrLine>Isafjordsgatan 28 A</addrLine>
									<postCode>164 40</postCode>
									<settlement>Kista</settlement>
									<country key="SE">Sweden</country>
								</address>
							</affiliation>
						</author>
						<author>
							<affiliation key="aff3">
								<address>
									<settlement>Santiago de Compostela</settlement>
									<country key="ES">Spain</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Evaluating Multi-task Curriculum Learning for Forecasting Energy Consumption in Electric Heavy-duty Vehicles</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
						<imprint>
							<date type="published" when="2024-10">October 2024</date>
						</imprint>
					</monogr>
					<idno type="MD5">1301E33BD38EE4ABC48E5A21515EE4DD</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T19:29+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Energy Consumption Forecasting</term>
					<term>Curriculum Learning</term>
					<term>Multi-task Learning</term>
					<term>Electric Vehicles</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Accurate energy consumption prediction is crucial for optimising the operation of electric commercial heavy-duty vehicles, particularly for efficient route planning, refining charging strategies, and ensuring optimal truck configuration for specific tasks. This study investigates the application of multi-task curriculum learning to enhance machine learning models for forecasting the energy consumption of various onboard systems in electric vehicles. Multi-task learning, unlike traditional training approaches, leverages auxiliary tasks to provide additional training signals, which has been shown to enhance predictive performance in many domains. By further incorporating curriculum learning, where simpler tasks are learned before progressing to more complex ones, neural network training becomes more efficient and effective.</p><p>We evaluate the suitability of these methodologies in the context of electric vehicle energy forecasting, examining whether the combination of multi-task learning and curriculum learning enhances algorithm generalisation, even with limited training data. We primarily focus on understanding the efficacy of different curriculum learning strategies, including sequential learning and progressive continual learning, using complex, real-world industrial data.</p><p>Our research further explores a set of auxiliary tasks designed to facilitate the learning process by targeting key consumption characteristics projected into future time frames. The findings illustrate the potential of multi-task curriculum learning to advance energy consumption forecasting, significantly contributing to the optimisation of electric heavy-duty vehicle operations. This work offers a novel perspective on integrating advanced machine learning techniques to enhance energy efficiency in the exciting field of electromobility.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>Predicting energy consumption for electric vehicles (EVs), especially those used in commercial heavy-duty contexts, is paramount for improving their operational efficiency and promoting sustainability. Effective energy consumption forecasts are indispensable for strategic route planning, optimising charging protocols, and ensuring that vehicle configurations align well with specific operational demands. As electric vehicles gain traction as a viable and ecofriendly alternative to internal combustion engine vehicles, the importance of precise energy consumption predictions becomes increasingly pronounced. The challenges in this domain are multifaceted, stemming from the inherent variability in driving conditions, vehicle load, and diverse environmental factors, which collectively complicate the development of accurate predictive models. Overcoming these obstacles is essential not only for enhancing the reliability and performance of EVs but also for minimising operational costs and boosting the overall efficiency of electric transport systems.</p><p>The transition to electric vehicles is a significant step towards reducing greenhouse gas emissions and achieving sustainable transportation goals. However, since limited energy storage puts unique constraints on which operations are feasible, the benefits of EVs can only be fully realised through the development of specified forecasting methods that accurately anticipate energy needs. In this context, AI and ML emerge as transformative tools. AI-driven models can analyse vast amounts of data to uncover patterns and relationships that are not immediately apparent, providing more accurate and reliable energy consumption forecasts. 
These models can adapt to new data, continuously improving their predictions over time.</p><p>Nevertheless, energy consumption forecasting for EVs faces critical challenges, such as dynamic driving conditions and fluctuating loads, which make even state-of-the-art methods struggle to handle complex real-world data effectively. While the potential to learn from historical data and identify trends that influence energy consumption is the biggest strength of ML-based approaches, it is crucial to develop robust models that can generalise well across different scenarios and vehicle types.</p><p>The complexity and variability inherent in forecasting energy consumption for electric vehicles make it a relevant testing ground for cutting-edge modelling techniques that promise to handle diverse and dynamic data inputs. In particular, Multi-Task Learning (MTL) presents a compelling solution by enabling simultaneous training across multiple related tasks, thereby leveraging shared information to improve the predictive performance of each task. In contrast, training machine learning models in a traditional setting only utilises the target task. MTL is particularly beneficial in scenarios with limited training data, as it enhances generalisation by incorporating auxiliary tasks that provide additional training signals. Moreover, the efficacy of MTL can be further amplified by integrating curriculum learning (CL), which structures the learning process in a progressive manner. Curriculum learning organises tasks from simple to complex, allowing the model to build a robust foundation before tackling more challenging problems. By combining these methodologies into multi-task curriculum learning (MCL), we can efficiently train neural networks that not only perform better on individual tasks but also generalise more effectively across different contexts. 
MCL optimises the learning trajectory, ensuring that simpler tasks enhance the model's capability to learn more complex ones, ultimately leading to more accurate and reliable energy consumption forecasts for electric heavy-duty vehicles. This integrative approach has been shown to be a potent strategy to address the multifaceted challenges in several domains but has not been applied to EV auxiliary energy forecasting before. Thus, this paper aims to evaluate the suitability of MCL in this real-world, complex scenario. Generating a set of auxiliary tasks is a critical step in the implementation of MCL -and how to do it for forecasting energy consumption in EVs requires experimental evaluation. To create auxiliary tasks, one must first obtain an understanding of the primary task, identifying key factors and variables that influence energy consumption and the types of patterns that are indicative of future behaviour. These factors often include vehicle load, driving speed, route characteristics, weather conditions, and driver behaviour. Each of these variables can serve as the basis for an auxiliary task. For instance, an auxiliary task might involve predicting the impact of vehicle load on energy consumption under different traffic conditions or estimating the effect of varying driving speeds on battery usage. Historical data from real-world vehicle operations can be mined to extract relevant patterns and correlations, which can then be used to define these auxiliary tasks. In this paper, we have decided to focus on the patterns within the forecasted value itself instead of exploiting multivariate vehicle signals. 
In particular, we define several types of energy consumption characteristics as targets for the auxiliary tasks, such as questioning whether the consumption in the next time frame exceeds the global mean, whether the consumption will be higher in the next time frame compared to the current consumption, or predicting the consumption difference between the start and the end of the next time frame. These tasks are general enough to be suitable for any forecasting task, while at the same time being sufficiently closely related to the actual primary task to, hopefully, provide useful information to boost the training process.</p><p>The core contribution of this paper is the evaluation of applying several multi-task curriculum learning techniques for forecasting the energy consumption of heavy-duty electric vehicles, including the proposition of utilising key consumption characteristics as targets for generating auxiliary tasks for MCL. Comparison of MCL variations, with combinations of curriculum learning strategy (sequential learning and progressive continual learning) and auxiliary tasks, illustrates the improvements in the performance on a real-world data collected from normal operations of commercial transportation electric vehicles. The experimental results show progressive continual learning, with a logistic growth weighting function governing the learning balance between the primary and the auxiliary task, achieves the best performance; the result also shows that the first auxiliary task is the most helpful task for subsystems 1 and 4; the third auxiliary task is the most helpful task for subsystems 2 and 3. Furthermore, it is observed that MCL with the proposed auxiliary tasks can improve the learning efficiency of the model, achieving faster convergence to a point beyond which the gain from further training is limited.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Related Work</head><p>Curriculum learning enables the training of machine learning models in a meaningful order, from easy samples to sets of difficult and complex samples <ref type="bibr" target="#b3">[1]</ref>. A common approach for CL introduces easy-to-hard ordering of samples for the training process, e.g., vanilla CL, self-paced CL, balanced CL, etc. When multiple tasks are available, the easy-to-hard ordering of the tasks to be learned can be applied as well. Multi-task learning can be applied, by sharing information across a set of related tasks in the training process, and the performance can be further improved <ref type="bibr" target="#b4">[2]</ref> via, e.g. Gradnorm <ref type="bibr" target="#b5">[3]</ref> balancing the losses between multiple tasks. While most multi-task learning approaches aim at learning multiple tasks simultaneously, progressive curriculum learning allows determining the best order to learn multiple tasks to maximise the final result. Study <ref type="bibr" target="#b6">[4]</ref> presented by Pentina et al. finds the best order of tasks to be learned in a sequence based on a generalisation bound criterion to optimise the average expected classification performance over all the tasks. Work <ref type="bibr" target="#b7">[5]</ref> by Siahpour et al. introduced a penalty coefficient, as a function of the epoch step, to govern the training process by suppressing the loss, and noise respectively, from the domain discrimination task in the early stage, to ensure the efficient training of neural networks. Shi et al. proposed progressive contrastive learning <ref type="bibr">[6]</ref> based on multi-prototypes in the dataset; the training process is ordered to learn the centroid prototype first, followed by the hard prototype, and finally the dynamic prototype. 
In this work, we explore sequential learning and progressive continual learning with a set of auxiliary tasks generated based on key characteristics of the target signal.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Problem Formulation</head><p>For a given primary learning task 𝒯 𝑖 , we create a set of auxiliary tasks 𝒯 𝑗 𝑖 , where 𝒯 𝑖 corresponds to the primary task (in our case, the forecasting of energy consumption for the 𝑖-th auxiliary subsystem in an electric truck), and 𝒯 𝑗 𝑖 corresponds to the 𝑗-th type of auxiliary task. The majority of the multi-task learning studies aim to learn all relevant tasks together to improve the performance for each task 𝒯 𝑖 . In our study, we are only interested in improving the energy forecasting tasks 𝒯 𝑖 , not the generated auxiliary tasks 𝒯 𝑗 𝑖 . All energy forecasting tasks and the auxiliary tasks are learned from the same dataset, multi-variate time series sensor readings were collected from the normal operations of several heavy-duty electric vehicles.</p><p>Let us denote data of the multivariate time series x of each vehicle 𝑣 by 𝑋 = { 𝑥 𝑘 𝑣,𝑡 | 𝑡 = 1, 2, ..., 𝑇 𝑒 (𝑣), 𝑘 = 1, 2, ..., 𝐾}, where 𝑥 𝑘 𝑣,𝑡 is the value of the 𝑘-th feature x given a vehicle/trajectory 𝑣 at time 𝑡, and 𝑇 𝑒 (𝑣) corresponds to the end of the recording. A subset of the features 𝑢 𝑖 𝑣,𝑡 reflects the energy consumption of subsystem 𝑖 at time 𝑡. The target energy consumption 𝑦 𝑖 𝑣,𝑡 0 in a future time frame 𝜏 𝑝ℎ can be approximated by summing up the energy consumed over this time frame 𝑦 𝑖 𝑣,𝑡 0 = ∑︀ 𝑡∈[𝑡 0 ,𝑡 0 +𝜏 𝑝ℎ ] 𝑝 𝑖 (𝑡) • ∆𝑡, where 𝑝 𝑖 (𝑡) is the power consumption at time 𝑡, and ∆𝑡 is the time interval between two samples.</p><p>In this study, we set 𝜏 𝑝ℎ equal to 10 minutes. For a given forecasting task 𝒯 𝑖 , a regression model 𝑓 𝑗 𝑖 (•) is trained together with one of the auxiliary tasks 𝒯 𝑗 𝑖 to estimate consumption 𝑦 𝑖 𝑣,𝑡 . In this study, neural networks, a shared feature extractor, with multiple heads, each corresponding to one task, were trained under different settings and evaluated for their performance after 200 training epochs. 
We explore different multi-task curriculum learning settings and auxiliary tasks for forecasting energy consumption. The MCL methods were compared to the traditional approach</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Method</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.">Auxiliary Tasks</head><p>For a given regression task 𝒯 𝑖 (forecasting energy consumption for one of the subsystems), a set of auxiliary tasks was generated to assist the learning progress. We explore the use of five types of consumption characteristics as targets for creating the auxiliary tasks: i) 𝒯 1 𝑖 : classifying whether the consumption in the next time frame exceeds the global mean for that subsystem 𝑖; ii) 𝒯 2 𝑖 : classifying whether the consumption will increase in the next time frame, compared with the current consumption; iii) 𝒯 3 𝑖 : classifying whether the consumption at the end of the next time frame exceeds the starting point; iv) 𝒯 4  𝑖 : predicting the consumption difference between the start and the end of the next time frame; v) 𝒯 5  𝑖 : predicting the difference between the peak consumption and the lowest consumption in the next time frame. The first three auxiliary tasks are classification tasks, while the other two are regression tasks. Learning to predict these key consumption characteristics in these auxiliary tasks 𝒯 𝑗 𝑖 , along with the primary tasks 𝒯 𝑖 , under MCL, is evaluated for its usefulness.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.">Network Architecture</head><p>The regression model evaluated for MCL in this study builds on a multi-layer perceptron. The model is comprised of a shared feature extractor and two heads, one head carries out the main task 𝒯 𝑖 , and the other corresponds to one of the five auxiliary tasks 𝒯 𝑗 𝑖 . The network architecture is illustrated in Figure <ref type="figure" target="#fig_0">1</ref>. For auxiliary tasks that are classification tasks, a sigmoid function was applied to the output of the corresponding head.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3.">Curriculum Learning Strategy</head><p>The two curriculum learning strategies evaluated in this work are sequential learning (SeqL) and progressive continual learning (PCL). The overall optimisation loss ℒ can be defined as:</p><formula xml:id="formula_0">ℒ = 𝜆ℒ 𝒯 𝑖 + (1 − 𝜆)ℒ 𝒯 𝑗 𝑖 (1)</formula><p>where ℒ 𝒯 𝑖 denotes the loss for the primary tasks, while ℒ 𝒯 𝑗 𝑖 denotes the loss for the auxiliary task 𝑗. The SeqL employed imposes a fixed ordering of the tasks, e.g. learning the auxiliary task first, before a predetermined epoch, and the primary task afterwards:</p><formula xml:id="formula_1">𝜆 𝑆𝑒𝑞𝐿 = {︃ 0 if 𝜂 &lt; 𝒩 𝑒𝑝 1 if 𝜂 ≥ 𝒩 𝑒𝑝 (<label>2</label></formula><formula xml:id="formula_2">)</formula><p>where 𝜂 is the current training epoch, and 𝒩 𝑒𝑝 is the number of epochs predetermined to switch to another task. The PCL employs a weighting mechanism, a function of training epochs, to govern the learning process and gradually increases the weights on the loss corresponding to the primary task:</p><formula xml:id="formula_3">𝜆 𝑃 𝐶𝐿 = 2 1 + 𝑒𝑥𝑝(−10𝛼𝜂/𝒩 𝑡𝑜𝑡 ) − 1 (<label>3</label></formula><formula xml:id="formula_4">)</formula><p>where 𝛼 is a coefficient governing the change rate (see Figure <ref type="figure" target="#fig_1">2</ref> for an illustration), 𝜂 is the current training epoch, and 𝒩 𝑡𝑜𝑡 is the total amount of training epochs. The two curriculum learning strategies were compared with MTL without any special curriculum learning and learning only on the primary task 𝒯 𝑖 . The two evaluation criteria in this study are (i) the test loss (Mean Absolute Error, MAE) after training converged, i.e., 𝑁 𝑡𝑜𝑡 epochs, and (ii) whether the proposed learning strategy achieves a faster convergence time, i.e., the epoch at which the test loss has reached a saturation point (no further significant decrease in the loss afterwards). 
In each case, different variants of MCL are compared against a learning process without any multi-task curriculum learning. The saturation point is detected using a knee point detection algorithm <ref type="bibr" target="#b9">[7]</ref>, proposed by Satopaa et al. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Experiment Result</head><p>The energy consumption dataset was collected from several electric trucks operating in different countries for a couple of months, including sensor readings of mileage, speed, ambient temperature, and energy consumed for auxiliary subsystems, etc., from sessions of driving.</p><p>The four subsystems we forecast energy consumption for are the air compressor 𝒯 1 , the air conditioner 𝒯 2 , the cabin heater 𝒯 3 , and the heater of the energy storage system 𝒯 4 .</p><p>For the experiment conducted in this paper, the neural networks were implemented via the PyTorch library <ref type="bibr" target="#b10">[8]</ref>, using an ADAM optimiser with a learning rate of 0.001. The loss function for the regression tasks is mean absolute error (MAE), and binary cross-entropy (BCE) was employed as the loss function for the classification tasks. The total number of training epochs 𝒩 𝑡𝑜𝑡 is set to 200. For the sequential learning strategy, 𝒩 𝑒𝑝 is set to 100, and for the progressive continual learning, 𝛼 values of 0.1 (i.e. a linear function) and 0.3 are tested. The experiments were conducted using 4-fold cross-validation driving session-wise, i.e. data from the same driving session would never appear in the training and the testing population together.</p><p>Table <ref type="table" target="#tab_0">1</ref> and Table <ref type="table" target="#tab_1">2</ref> show the training and testing losses after 200 epochs of training of the neural networks using multi-task learning without any curriculum learning (MTL), sequential learning (SeqL), progressive continual learning with an 𝛼 of 0.1 (PCL-lin), and an 𝛼 of 0.3 (PCL-exp). The baseline performance, single task learning (STL), is produced with learning only on the primary task 𝒯 𝑖 for each subsystem, shown in parentheses. It is shown in both tables that the lowest averaged MAE is achieved using PCL-exp. 
As a sanity check, Table <ref type="table" target="#tab_0">1</ref> demonstrates that the training losses, after 200 epochs of training, of most MCL methods did converge to a level comparable to STL. For the testing losses shown in Table <ref type="table" target="#tab_1">2</ref>, applying PCL-exp on task sets {𝒯 1 , 𝒯 1  1 } and {𝒯 4 , 𝒯 1 4 } achieved the lowest averaged MAE for forecasting energy consumption of subsystems 1 and 4 (i.e., the first auxiliary task appears to be the most helpful auxiliary task for subsystems 1 and 4); similarly, applying PCL-exp on task sets {𝒯 2 , 𝒯 3 2 } and {𝒯 3 , 𝒯 3  3 } achieved the lowest averaged MAE for forecasting energy consumption of subsystems 2 and 3.</p><p>Figure <ref type="figure" target="#fig_2">3</ref> illustrates the differences between several multi-task curriculum learning strategies, focusing on the convergence speed. Specifically, we identify a reference point (epoch) beyond which the gain from further training is limited. This reference point is computed using a knee point detection method (algorithm <ref type="bibr" target="#b9">[7]</ref> by Satopaa et al.) on the mean STL test losses (shown as grey dots and the corresponding dash line). The four plots in Figure <ref type="figure" target="#fig_2">3</ref> illustrate the testing loss for learning the four primary tasks, along with their 5-th auxiliary task, i.e. 𝒯 5 𝑖 . It is observed in Figure <ref type="figure" target="#fig_2">3</ref>: i) there is no significant difference between the four approaches for 𝒯 1 ; ii) MTL and PCL-lin drop slightly slower compared to STL and PCL-exp for 𝒯 2 ; iii) both PCL approaches drop slower compared with STL and MTL for 𝒯 3 ; iv) MTL, PCL-lin, and PCL-exp drop faster compared to STL. Table <ref type="table" target="#tab_2">3</ref> shows a comparison between MCL methods on the convergence time to the reference point (computed based on STL mean testing losses over the four folds). 
It is observed that: i) MTL outperforms STL in all four primary tasks, and converges to the reference point faster than other approaches in three out of four primary tasks; ii) PCL-lin achieved fast convergence for two of the tasks; iii) PCL-exp achieved better performance compared to PCL-lin, with an overall short convergence time. The result corresponding to SeqL is particularly interesting. Although a 𝒩 𝑒𝑝 of 100 epochs is adopted for SeqL (i.e. trained on one of the auxiliary tasks for the first 100 epochs before learning the primary task), the testing loss converges to the reference point within 10 epochs in the majority of the cases. From an empirical perspective, the proposed auxiliary tasks assisted the learning (of the models) for the primary task, resulting in a faster convergence time to the reference point. </p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: MLP network architecture</figDesc><graphic coords="6,218.93,84.20,157.41,445.17" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: PCL weighting function</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 3 :</head><label>3</label><figDesc>Figure 3: Comparison of convergence speed for different MCL approaches and auxiliary tasks.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1</head><label>1</label><figDesc>Comparison of training loss after the 200 epochs for different MCL approaches using different auxiliary tasks. The reference performances (using STL) are placed in parentheses. MCL results outperforming the baseline are highlighted in bold, and the best performance for each subsystem is underlined.</figDesc><table><row><cell>Task1 (0.6202 ± 0.0163)</cell><cell>MTL</cell><cell>SeqL</cell><cell>PCL-lin</cell><cell>PCL-exp</cell></row><row><cell>AuxTask1</cell><cell cols="4">0.6125 ± 0.0141 0.6054 ± 0.0167 0.609 ± 0.0185 0.6013 ± 0.0156</cell></row><row><cell>AuxTask2</cell><cell>0.6374 ± 0.0147</cell><cell>0.6502 ± 0.0138</cell><cell>0.6802 ± 0.0096</cell><cell>0.6435 ± 0.0156</cell></row><row><cell>AuxTask3</cell><cell cols="4">0.6165 ± 0.0166 0.6131 ± 0.0118 0.6076 ± 0.0154 0.6033 ± 0.0155</cell></row><row><cell>AuxTask4</cell><cell>0.6239 ± 0.0132</cell><cell>0.626 ± 0.0119</cell><cell cols="2">0.6567 ± 0.0161 0.6182 ± 0.0121</cell></row><row><cell>AuxTask5</cell><cell cols="2">0.6256 ± 0.0113 0.6152 ± 0.0121</cell><cell>0.625 ± 0.0147</cell><cell>0.614 ± 0.0117</cell></row><row><cell>Task2 (0.2617 ± 0.0245)</cell><cell>MTL</cell><cell>SeqL</cell><cell>PCL-lin</cell><cell>PCL-exp</cell></row><row><cell>AuxTask1</cell><cell>0.2681 ± 0.023</cell><cell>0.276 ± 0.0128</cell><cell cols="2">0.2959 ± 0.0157 0.2541 ± 0.0171</cell></row><row><cell>AuxTask2</cell><cell>0.2619 ± 0.0158</cell><cell>0.3016 ± 0.0308</cell><cell>0.2939 ± 0.0204</cell><cell>0.2475 ± 0.037</cell></row><row><cell>AuxTask3</cell><cell>0.2662 ± 0.0395</cell><cell cols="3">0.2862 ± 0.0324 0.2379 ± 0.0245 0.2158 ± 0.0186</cell></row><row><cell>AuxTask4</cell><cell>0.2534 ± 0.011</cell><cell>0.2866 ± 0.0255</cell><cell cols="2">0.2795 ± 0.0202 0.2366 ± 0.0432</cell></row><row><cell>AuxTask5</cell><cell>0.2638 ± 0.0168</cell><cell>0.2691 ± 0.0361</cell><cell cols="2">0.2971 ± 0.0167 0.2436 
± 0.0285</cell></row><row><cell>Task3 (0.3173 ± 0.0115)</cell><cell>MTL</cell><cell>SeqL</cell><cell>PCL-lin</cell><cell>PCL-exp</cell></row><row><cell>AuxTask1</cell><cell cols="2">0.3223 ± 0.0132 0.3138 ± 0.0084</cell><cell>0.3248 ± 0.012</cell><cell>0.3116 ± 0.0111</cell></row><row><cell>AuxTask2</cell><cell>0.3217 ± 0.0109</cell><cell>0.3222 ± 0.0116</cell><cell>0.3423 ± 0.0113</cell><cell>0.317 ± 0.0096</cell></row><row><cell>AuxTask3</cell><cell cols="4">0.3148 ± 0.0074 0.3117 ± 0.0151 0.3229 ± 0.0121 0.3018 ± 0.0116</cell></row><row><cell>AuxTask4</cell><cell>0.333 ± 0.0126</cell><cell>0.3272 ± 0.0137</cell><cell>0.356 ± 0.0214</cell><cell>0.3188 ± 0.0152</cell></row><row><cell>AuxTask5</cell><cell>0.3213 ± 0.0103</cell><cell>0.3188 ± 0.0143</cell><cell cols="2">0.3412 ± 0.0091 0.3171 ± 0.0122</cell></row><row><cell>Task4 (0.2936 ± 0.0129)</cell><cell>MTL</cell><cell>SeqL</cell><cell>PCL-lin</cell><cell>PCL-exp</cell></row><row><cell>AuxTask1</cell><cell cols="4">0.2903 ± 0.0186 0.3248 ± 0.0159 0.2684 ± 0.0208 0.2646 ± 0.0124</cell></row><row><cell>AuxTask2</cell><cell>0.2941 ± 0.0123</cell><cell>0.3511 ± 0.0171</cell><cell cols="2">0.3565 ± 0.0866 0.2583 ± 0.0156</cell></row><row><cell>AuxTask3</cell><cell>0.2979 ± 0.0136</cell><cell cols="2">0.3064 ± 0.0165 0.2624 ± 0.0081</cell><cell>0.344 ± 0.152</cell></row><row><cell>AuxTask4</cell><cell>0.3269 ± 0.0127</cell><cell>0.3712 ± 0.0117</cell><cell>0.425 ± 0.035</cell><cell>0.329 ± 0.0275</cell></row><row><cell>AuxTask5</cell><cell>0.3145 ± 0.0142</cell><cell>0.3142 ± 0.0311</cell><cell>0.3334 ± 0.0076</cell><cell>0.3036 ± 0.0182</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2</head><label>2</label><figDesc>Comparison of test loss after the 200 epochs for different MCL approaches using different auxiliary tasks. The reference performances (using STL) are placed in parentheses. MCL results outperforming the baseline are highlighted in bold, and the best performance for each subsystem is underlined.</figDesc><table><row><cell>Task1 (0.6861 ± 0.0713)</cell><cell>MTL</cell><cell>SeqL</cell><cell>PCL-lin</cell><cell>PCL-exp</cell></row><row><cell>AuxTask1</cell><cell>0.6829 ± 0.0707</cell><cell>0.692 ± 0.072</cell><cell cols="2">0.6827 ± 0.0727 0.6784 ± 0.0672</cell></row><row><cell>AuxTask2</cell><cell>0.6948 ± 0.0736</cell><cell>0.7037 ± 0.0749</cell><cell>0.7248 ± 0.0702</cell><cell>0.7076 ± 0.0709</cell></row><row><cell>AuxTask3</cell><cell cols="2">0.6812 ± 0.0744 0.6969 ± 0.0634</cell><cell>0.698 ± 0.0762</cell><cell>0.6943 ± 0.0774</cell></row><row><cell>AuxTask4</cell><cell>0.6917 ± 0.0775</cell><cell>0.6934 ± 0.069</cell><cell>0.7058 ± 0.0634</cell><cell>0.6881 ± 0.0684</cell></row><row><cell>AuxTask5</cell><cell>0.6968 ± 0.0712</cell><cell>0.6894 ± 0.0735</cell><cell>0.6913 ± 0.074</cell><cell>0.6862 ± 0.073</cell></row><row><cell>Task2 (0.4374 ± 0.1821)</cell><cell>MTL</cell><cell>SeqL</cell><cell>PCL-lin</cell><cell>PCL-exp</cell></row><row><cell>AuxTask1</cell><cell cols="4">0.4553 ± 0.2035 0.4199 ± 0.1608 0.4277 ± 0.1788 0.4285 ± 0.177</cell></row><row><cell>AuxTask2</cell><cell cols="4">0.4322 ± 0.1671 0.4448 ± 0.1766 0.4319 ± 0.1676 0.4329 ± 0.1678</cell></row><row><cell>AuxTask3</cell><cell cols="4">0.4109 ± 0.1556 0.4427 ± 0.1782 0.4105 ± 0.1398 0.3929 ± 0.1602</cell></row><row><cell>AuxTask4</cell><cell cols="2">0.4105 ± 0.1632 0.4436 ± 0.1673</cell><cell cols="2">0.4699 ± 0.1928 0.4362 ± 0.1752</cell></row><row><cell>AuxTask5</cell><cell>0.4702 ± 0.2093</cell><cell>0.4734 ± 0.2158</cell><cell>0.4874 ± 0.2344</cell><cell>0.4632 ± 
0.2307</cell></row><row><cell>Task3 (0.3827 ± 0.0551)</cell><cell>MTL</cell><cell>SeqL</cell><cell>PCL-lin</cell><cell>PCL-exp</cell></row><row><cell>AuxTask1</cell><cell cols="2">0.3777 ± 0.0593 0.377 ± 0.0627</cell><cell cols="2">0.3829 ± 0.0609 0.3774 ± 0.0577</cell></row><row><cell>AuxTask2</cell><cell>0.387 ± 0.0503</cell><cell>0.3901 ± 0.0618</cell><cell>0.401 ± 0.0534</cell><cell>0.3892 ± 0.0659</cell></row><row><cell>AuxTask3</cell><cell>0.3859 ± 0.0534</cell><cell>0.3857 ± 0.0578</cell><cell cols="2">0.3868 ± 0.0588 0.3766 ± 0.0684</cell></row><row><cell>AuxTask4</cell><cell>0.3847 ± 0.0683</cell><cell>0.3828 ± 0.0563</cell><cell>0.4021 ± 0.0724</cell><cell>0.3865 ± 0.0602</cell></row><row><cell>AuxTask5</cell><cell>0.3874 ± 0.0587</cell><cell>0.3868 ± 0.0605</cell><cell>0.4016 ± 0.0563</cell><cell>0.38 ± 0.0549</cell></row><row><cell>Task4 (0.4166 ± 0.0679)</cell><cell>MTL</cell><cell>SeqL</cell><cell>PCL-lin</cell><cell>PCL-exp</cell></row><row><cell>AuxTask1</cell><cell>0.4155 ± 0.065</cell><cell>0.4783 ± 0.0698</cell><cell cols="2">0.4332 ± 0.0801 0.3986 ± 0.0567</cell></row><row><cell>AuxTask2</cell><cell cols="2">0.4085 ± 0.0745 0.4786 ± 0.0853</cell><cell>0.5225 ± 0.1217</cell><cell>0.4339 ± 0.0813</cell></row><row><cell>AuxTask3</cell><cell>0.424 ± 0.0492</cell><cell>0.5029 ± 0.0554</cell><cell>0.4253 ± 0.054</cell><cell>0.4874 ± 0.1068</cell></row><row><cell>AuxTask4</cell><cell>0.4251 ± 0.0774</cell><cell>0.4548 ± 0.0767</cell><cell>0.4936 ± 0.0718</cell><cell>0.4375 ± 0.0664</cell></row><row><cell>AuxTask5</cell><cell>0.4291 ± 0.0613</cell><cell>0.4442 ± 0.0662</cell><cell>0.4535 ± 0.0651</cell><cell>0.4347 ± 0.0716</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 3</head><label>3</label><figDesc>Comparison of convergence speed to reach a point beyond which the gain from further training is limited. The reference point is given by STL loss sequences averaged over 4 folds. MCL results outperforming the baseline are highlighted in bold, and the best performance for each subsystem is underlined.</figDesc><table><row><cell>Task1 (20 Ep.)</cell><cell>MTL</cell><cell>SeqL</cell><cell>PCL-lin</cell><cell>PCL-exp</cell></row><row><cell>AuxTask1</cell><cell>13.0 ± 1.4142</cell><cell>102.25 ± 0.433</cell><cell>16.75 ± 2.2776</cell><cell>16.25 ± 8.9268</cell></row><row><cell>AuxTask2</cell><cell>32.5 ± 4.0311</cell><cell>125.75 ± 2.2776</cell><cell>103.5 ± 28.8141</cell><cell>45.75 ± 26.6962</cell></row><row><cell>AuxTask3</cell><cell>15.75 ± 3.8971</cell><cell>106.5 ± 1.118</cell><cell>29.0 ± 3.3912</cell><cell>22.25 ± 8.1968</cell></row><row><cell>AuxTask4</cell><cell>10.5 ± 1.5</cell><cell>106.5 ± 2.2913</cell><cell>70.5 ± 32.7605</cell><cell>29.0 ± 15.1493</cell></row><row><cell>AuxTask5</cell><cell>15.25 ± 6.7961</cell><cell>102.5 ± 0.5</cell><cell>18.0 ± 4.3012</cell><cell>19.0 ± 9.083</cell></row><row><cell>Task2 (19 Ep.)</cell><cell>MTL</cell><cell>SeqL</cell><cell>PCL-lin</cell><cell>PCL-exp</cell></row><row><cell>AuxTask1</cell><cell cols="2">15.625 ± 7.9047 58.875 ± 50.5827</cell><cell>34.0 ± 22.7211</cell><cell>20.125 ± 14.4606</cell></row><row><cell>AuxTask2</cell><cell>18.0 ± 9.2736</cell><cell>108.25 ± 9.8075</cell><cell>44.5 ± 14.239</cell><cell>23.5 ± 11.4127</cell></row><row><cell>AuxTask3</cell><cell>26.0 ± 16.4773</cell><cell>106.75 ± 7.9804</cell><cell>42.5 ± 12.0312</cell><cell>22.75 ± 7.1545</cell></row><row><cell>AuxTask4</cell><cell>32.0 ± 33.2039</cell><cell cols="2">111.25 ± 13.3112 60.3333 ± 21.6384</cell><cell>32.0 ± 18.8149</cell></row><row><cell>AuxTask5</cell><cell>17.0 ± 11.2916</cell><cell>108.5 ± 
8.6168</cell><cell>100.0 ± 86.0145</cell><cell>28.0 ± 24.8697</cell></row><row><cell>Task3 (14 Ep.)</cell><cell>MTL</cell><cell>SeqL</cell><cell>PCL-lin</cell><cell>PCL-exp</cell></row></table></figure>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgments</head><p>The work was carried out with support from the Knowledge Foundation and Vinnova (Sweden's innovation agency) through the Vehicle Strategic Research and Innovation Programme FFI.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">MTL SeqL PCL-lin PCL</title>
		<author>
			<persName><surname>Task4</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">-exp AuxTask1</title>
		<imprint>
			<biblScope unit="volume">14</biblScope>
			<biblScope unit="page">1651</biblScope>
			<date>19</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title/>
		<idno>7178 18.875 ± 8.1 17.1429 ± 4.4538</idno>
	</analytic>
	<monogr>
		<title level="j">AuxTask3</title>
		<imprint>
			<biblScope unit="volume">14</biblScope>
			<biblScope unit="issue">75 ± 2</biblScope>
			<biblScope unit="page" from="375" to="422" />
			<date>9896 61</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<monogr>
		<title level="m">for forecasting the energy consumption of auxiliary subsystems in heavy-duty electric vehicles. The preliminary results show that progressive continual learning has achieved the best performance (lowest averaged MSE) compared to multi-task learning with any CL, Sequential CL, and the traditional approach</title>
				<imprint/>
	</monogr>
	<note>) enabling CL across primary tasks, based on task relevancy</note>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Curriculum learning: A survey</title>
		<author>
			<persName><forename type="first">P</forename><surname>Soviany</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">T</forename><surname>Ionescu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Rota</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Sebe</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">International Journal of Computer Vision</title>
		<imprint>
			<biblScope unit="volume">130</biblScope>
			<biblScope unit="page" from="1526" to="1565" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">A survey on multi-task learning</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Yang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Knowledge and Data Engineering</title>
		<imprint>
			<biblScope unit="volume">34</biblScope>
			<biblScope unit="page" from="5586" to="5609" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Badrinarayanan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C.-Y</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Rabinovich</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International conference on machine learning</title>
				<imprint>
			<publisher>PMLR</publisher>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="794" to="803" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Curriculum learning of multiple tasks</title>
		<author>
			<persName><forename type="first">A</forename><surname>Pentina</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Sharmanska</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">H</forename><surname>Lampert</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the IEEE conference on computer vision and pattern recognition</title>
				<meeting>the IEEE conference on computer vision and pattern recognition</meeting>
		<imprint>
			<date type="published" when="2015">2015</date>
			<biblScope unit="page" from="5492" to="5500" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">A novel transfer learning approach in remaining useful life prediction for incomplete dataset</title>
		<author>
			<persName><forename type="first">S</forename><surname>Siahpour</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Lee</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Instrumentation and Measurement</title>
		<imprint>
			<biblScope unit="volume">71</biblScope>
			<biblScope unit="page" from="1" to="11" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<monogr>
		<author>
			<persName><forename type="first">J</forename><surname>Shi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Yin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Xie</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Qu</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2402.19026</idno>
		<title level="m">Progressive contrastive learning with multi-prototype for unsupervised visible-infrared person re-identification</title>
				<imprint>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Finding a &quot;kneedle&quot; in a haystack: Detecting knee points in system behavior</title>
		<author>
			<persName><forename type="first">V</forename><surname>Satopaa</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Albrecht</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Irwin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Raghavan</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">2011 31st international conference on distributed computing systems workshops</title>
				<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="2011">2011</date>
			<biblScope unit="page" from="166" to="171" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<monogr>
		<author>
			<persName><forename type="first">A</forename><surname>Paszke</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Gross</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Chintala</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Chanan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Devito</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Lin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Desmaison</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Antiga</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Lerer</surname></persName>
		</author>
		<title level="m">Automatic differentiation in pytorch</title>
				<imprint>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
