<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Optimising Hierarchical Demand Forecasting with Explainable AI: Insights into Key Drivers</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Mátyás</forename><surname>Kuti-Kreszács</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">Babeş-Bolyai University</orgName>
								<address>
									<settlement>Cluj-Napoca</settlement>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Optimising Hierarchical Demand Forecasting with Explainable AI: Insights into Key Drivers</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">70FAD0E93F19F659D0ABABB0282DC37C</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T18:27+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Demand forecasting is a prediction problem that aims to estimate future needs based on historical data. It serves as the basis for optimal decision making in multiple areas of value chains such as manufacturing, logistics, and retail. Feature importance is particularly valuable in demand forecasting models, where understanding demand drivers such as price and promotions can help companies optimise pricing, promotional activities, resource planning, and inventory planning. Our goal is to identify feature importance techniques applicable to hierarchical forecasting problems, providing insights into feature importance and the underlying decision-making process and helping to understand the model's reasoning. We propose applying SHAP values to a forecasting model using part of a real-world dataset. The results will provide insight into the key drivers of the forecast and help to understand the impact of the features on the decisions made by the model.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>Demand forecasting has become increasingly important for businesses and serves as the basis for optimal decision making in multiple areas of value chains such as manufacturing, logistics, and retail. With multiple products, manufacturing locations, sales channels, and geographical regions, demand forecasting can be complex and hierarchical in nature.</p><p>However, the problem can be formulated as a regression problem, with the aim of predicting future demand based on historical data. This regression problem can be solved using machine learning models such as random forests, gradient boosting, and neural networks. Unfortunately, these models are considered black boxes and their predictions are hard to interpret. This is where explainable AI (XAI) techniques come into play, providing insights into the model's decision-making process and helping to understand the underlying rules and reasoning behind the predictions.</p><p>One of the most fundamental methods for understanding a model's reasoning is feature importance or attribution, which allows the identification of the key factors contributing to the model's predictions. This is especially important in demand forecasting models, where demand drivers, such as price, promotions, weather, holidays, and economic indicators, can influence demand. Understanding these drivers can help companies optimise pricing, promotional activities, resource planning, and inventory management.</p><p>Our goal is to identify feature importance techniques applicable to demand forecasting models, aiming to discover key features contributing to the decisions and to explain the model's reasoning at different levels. The significance of our research lies in improving the reasoning and transparency of multiseries and hierarchical demand forecasting models by providing insights into feature importance and the underlying rules at various levels. The methods employed are expected to be useful not only in demand forecasting, but also in other grouped and hierarchical forecasting problems in different domains.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.1.">Research Questions</head><p>The gap identified in the literature is the lack of studies that apply feature importance techniques to multiseries models for hierarchical demand forecasting problems and analyse the underlying decision drivers at different levels. Other studies focused on the representation of the explanation for sales forecasting models, but not on the explanation methods themselves <ref type="bibr" target="#b0">[1]</ref>. Our research questions are:</p><p>• RQ1: Can existing feature importance techniques be applied to multi-series and hierarchical models to identify key features and explain the underlying decision factors?
• RQ2: How can feature importance be translated to different hierarchical levels?
• RQ3: How do these methods perform when applied to real-world datasets?
• RQ4: How can the results be visualised and interpreted?
• RQ5: What methods are most effective in this context?</p><p>In our current work, we partially address RQ1 and RQ2 by proposing a method to apply SHAP values to a LightGBM model used for forecasting hierarchical time-series data. Furthermore, we make progress on RQ3 using part of a real-world dataset; however, evaluation is still pending. Last but not least, we address RQ4 by visualising the results in a way that can be interpreted by the user. RQ5 is still open and will be addressed in future work.</p><p>(Published in RuleML+RR'24: Companion Proceedings of the 8th International Joint Conference on Rules and Reasoning, September 16-22, 2024, Bucharest, Romania. Contact: matyas.kuti@ubbcluj.ro, https://www.linkedin.com/in/kkmatyas/, ORCID 0009-0004-4997-2000 (M. Kuti-Kreszács).)</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Literature review</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1.">Demand Forecasting with machine learning</head><p>Demand forecasting is a prediction problem that aims to estimate future needs based on historical data. Statistical forecasting methods such as ARIMA <ref type="bibr" target="#b1">[2,</ref><ref type="bibr" target="#b2">3]</ref> and exponential smoothing <ref type="bibr" target="#b2">[3]</ref> have been widely used in demand forecasting. However, they have limitations in intermittent multi-series and hierarchical forecasting, where machine learning models have shown better performance <ref type="bibr" target="#b3">[4]</ref>. An important aspect is that there may be multiple exogenous variables, so-called demand drivers <ref type="bibr" target="#b4">[5]</ref>, that can influence demand. Internal factors such as price and promotions, as well as external factors like weather, holidays, and economic indicators, can be considered demand drivers. These can be used as features in machine learning models to improve forecast accuracy.</p><p>Machine learning models such as tree ensembles and neural networks have been successfully applied to demand forecasting tasks <ref type="bibr" target="#b3">[4]</ref>. Ensemble models in general can be homogeneous, with individual models of the same type, or heterogeneous, with models of different types. We considered only homogeneous ensemble tree models because of the applicability of some model-specific explanation methods. To build tree ensembles, bagging methods such as random forest <ref type="bibr" target="#b5">[6]</ref> can be used, which train multiple decision trees on different subsets of the data; the final prediction is the average of the predictions of the individual models. In addition, boosting methods such as Gradient Boosting Machines (GBM) <ref type="bibr" target="#b6">[7]</ref>, XGBoost <ref type="bibr" target="#b7">[8]</ref>, and LightGBM <ref type="bibr" target="#b8">[9]</ref> train models sequentially on the residuals of the previous model; in this case the final prediction is the sum of the individual predictions. In a notable forecasting competition <ref type="bibr" target="#b9">[10]</ref>, LightGBM models won and secured four of the top five positions.</p></div>
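To make the residual-fitting idea concrete, the following minimal sketch (not the authors' code, and far simpler than GBM/XGBoost/LightGBM) boosts decision stumps on a toy series; the final prediction is the sum of the scaled stump outputs, as described above:

```python
import numpy as np

def fit_stump(x, y):
    """Fit a one-split regression tree (stump) minimising squared error."""
    best = None
    for t in np.unique(x):
        left, right = y[x <= t], y[x > t]
        if len(left) == 0 or len(right) == 0:
            continue
        err = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if best is None or err < best[0]:
            best = (err, t, left.mean(), right.mean())
    _, t, lv, rv = best
    return lambda q: np.where(q <= t, lv, rv)

def boost(x, y, n_rounds=20, lr=0.5):
    """Sequentially fit stumps on the residuals of the running prediction."""
    preds = np.zeros_like(y, dtype=float)
    stumps = []
    for _ in range(n_rounds):
        residual = y - preds          # what the ensemble still gets wrong
        s = fit_stump(x, residual)
        stumps.append(s)
        preds += lr * s(x)            # final prediction = sum of scaled stumps
    return lambda q: sum(lr * s(q) for s in stumps)

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 200)
y = np.sin(x) + 0.1 * rng.normal(size=200)
model = boost(x, y)
mse = ((model(x) - y) ** 2).mean()
```

Each round only has to correct the residual left by the previous rounds, which is why the training error shrinks monotonically on the training set.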
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2.">Forecasting techniques</head><p>Forecasting techniques can be divided into single-series or multi-series forecasting from the perspective of the model's input. Single-series forecasting refers to the prediction of a single time series, while multiseries forecasting involves the prediction of multiple time series using a single global model <ref type="bibr" target="#b10">[11]</ref>. These series can be related to each other, such as sales of different products, or they can be independent, such as sales in different regions; therefore, it is important to consider the hierarchical structure of the data.</p><p>Hierarchical forecasting refers to the prediction of multiple time series that are related to each other in a hierarchical structure <ref type="bibr" target="#b11">[12]</ref>. It can be tackled with different single-level approaches, such as bottom-up, top-down, or middle-out <ref type="bibr" target="#b11">[12]</ref>. The top-down approach would involve a single-series model for the total demand, which is then disaggregated to the lower levels. The middle-out and bottom-up approaches would involve a multiseries model. Grouped time-series forecasting is a special case of hierarchical forecasting, where the series are aggregated based on attributes such as product type, region, or sales channel.</p><p>[5] suggests three major hierarchies in demand forecasting: product hierarchy, geographical hierarchy, and time hierarchy. The product hierarchy refers to the categorisation of products according to their attributes, such as product type, brand, or category. The geographical hierarchy involves the division of sales regions based on geographic attributes, such as country, state, or city, down to the point of sale. The time hierarchy refers to the temporal structure of the data, such as year, month, week, day, and hour.</p></div>
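The bottom-up approach can be sketched in a few lines of plain Python: store-level forecasts are summed to the state level and then to the total (the store names follow the M5 convention, but the forecast numbers are purely hypothetical):

```python
# Hypothetical store-level forecasts keyed by (state, store).
store_forecasts = {
    ("TX", "TX_1"): 120.0,
    ("TX", "TX_2"): 95.0,
    ("TX", "TX_3"): 110.0,
    ("WI", "WI_1"): 80.0,
    ("WI", "WI_2"): 60.0,
}

# Bottom-up: sum store forecasts into state forecasts...
state_forecasts = {}
for (state, _store), value in store_forecasts.items():
    state_forecasts[state] = state_forecasts.get(state, 0.0) + value

# ...and state forecasts into the total.
total_forecast = sum(state_forecasts.values())
```

The same summation pattern generalises to any number of levels, which is what makes bottom-up aggregation attractive for a single global model producing bottom-level predictions.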
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.3.">Feature importance</head><p>Feature importance (FI) or feature attribution is considered an interpretation method resulting in a summary statistic that assigns a score to each input feature <ref type="bibr" target="#b12">[13]</ref>. Depending on their scope, FI methods can be global or local <ref type="bibr" target="#b13">[14,</ref><ref type="bibr" target="#b12">13]</ref>. Global feature importance (GFI) or model feature attribution methods explain the contribution of features to overall predictions, while local FI quantifies feature contributions to specific predictions <ref type="bibr" target="#b12">[13]</ref>. Although related, GFI methods differ from feature selection, which identifies irrelevant features before training. GFI methods can be model-specific, which are limited to specific model types, while model-agnostic ones are applicable independent of the model type <ref type="bibr" target="#b12">[13]</ref>. Another categorisation of FI methods is given by how the importance is calculated: it can be based on the model's structure, or it can rely on a dataset.</p><p>Among the model-agnostic methods, one of the most common is permutation feature importance (PFI), which was proposed to measure FI in random forests <ref type="bibr" target="#b14">[15]</ref>. It is a model-agnostic, data-dependent method that measures the decrease in the model's performance when the values of a feature are permuted. PFI can be calculated using different metrics such as the mean squared error (MSE), the mean absolute error (MAE), or the coefficient of determination (𝑅 2 ). PFI also has limitations, as it is sensitive to over- and underfitting <ref type="bibr" target="#b15">[16]</ref>, in which case the FI differs on training and test data, so the use of both datasets can be beneficial. In addition, another flaw of the PFI method is that it can generate cases for which the model has no training data <ref type="bibr" target="#b16">[17,</ref><ref type="bibr" target="#b17">18]</ref>, but other methods were proposed to overcome this <ref type="bibr" target="#b18">[19,</ref><ref type="bibr" target="#b19">20]</ref>.</p><p>SHAP (SHapley Additive exPlanations) <ref type="bibr" target="#b20">[21]</ref> values provide local explanations for individual predictions, but their aggregates are useful for assessing global feature importance. For example, the mean absolute SHAP values quantify the importance of a feature regardless of the direction of its impact on the prediction. There are different approximation algorithms, of which Kernel SHAP <ref type="bibr" target="#b20">[21]</ref> is a model-agnostic one. TShap <ref type="bibr" target="#b21">[22]</ref> is a method for estimating SHAP values for time series data, but it uses a surrogate model, so it gives the FI of the surrogate. Another related method is SAGE (Shapley additive global importance) <ref type="bibr" target="#b18">[19]</ref>, which estimates the contribution of each feature to the model's performance.</p><p>Tree-specific GFI methods include gain-based importance, which was already introduced with decision trees <ref type="bibr" target="#b22">[23]</ref> and measures the reduction in the error (e.g., mean absolute error, MAE) achieved by the decisions based on the respective feature. Another measure, split-based importance <ref type="bibr" target="#b7">[8]</ref>, refers to the number of decisions made by the model based on a feature. The previously presented SHAP also has a tree model-specific approximation, called TreeSHAP <ref type="bibr" target="#b23">[24]</ref>.</p></div>
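As an illustration of PFI with the MSE metric, the sketch below (our own toy example, using a plain least-squares model rather than a random forest) permutes each column and records the resulting increase in error; an informative feature yields a large increase, an irrelevant one almost none:

```python
import numpy as np

def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """PFI: importance of feature j = mean MSE increase after permuting column j."""
    rng = np.random.default_rng(seed)
    base_mse = ((predict(X) - y) ** 2).mean()
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        scores = []
        for _ in range(n_repeats):
            Xp = X.copy()
            rng.shuffle(Xp[:, j])     # break the link between feature j and y
            scores.append(((predict(Xp) - y) ** 2).mean())
        importances[j] = np.mean(scores) - base_mse
    return importances

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 3))
# y depends strongly on feature 0, weakly on feature 1, not at all on feature 2.
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=500)
w, *_ = np.linalg.lstsq(X, y, rcond=None)   # simple stand-in model
predict = lambda A: A @ w
imp = permutation_importance(predict, X, y)
```

Because only one column is permuted at a time, the method is model-agnostic but, as noted above, can evaluate the model on feature combinations it never saw in training.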
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.4.">Explainability in forecasting</head><p>The number of publications on forecasting explainability is limited. <ref type="bibr" target="#b0">[1]</ref> tackled the presentation of explanations for sales forecasting models, but not the explanation methods themselves. <ref type="bibr" target="#b24">[25]</ref> used SHAP values to explain the predictions of a time series model, but at the local level and not the global level. The skforecast <ref type="bibr" target="#b10">[11]</ref> library extracts model-specific global feature importance from tree ensemble models. Existing work focuses on either global feature importance or local feature contributions without considering the multi-series and hierarchical structure of the data.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.5.">Feature importance as a basis for model reasoning</head><p>Feature importance methods can provide insight into the model's decision-making process and help to understand the underlying rules and reasoning behind the predictions. By including demand drivers as features in the model, feature importance methods can help to identify the key drivers of demand. For external factors such as weather, holidays, and economic indicators, feature importance can help to understand their impact on demand. For internal factors like price and promotions, feature importance can help to understand post-promotion effects and the impact of price changes on demand <ref type="bibr" target="#b4">[5]</ref>. Knowing the influence of internal factors can help to optimise pricing strategies and promotional activities. However, causation and correlation are different concepts, and feature importance methods can only provide correlation; therefore, the identified key features should be further analysed to understand the causation <ref type="bibr" target="#b14">[15]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Methodology</head><p>Our preliminary research focuses on the methodological aspects of applying feature importance techniques to hierarchical forecasting models. This includes the adaptation of existing methods, but also tool development to support the analysis of hierarchical forecasting models. Later, we plan to conduct an empirical study to evaluate the methods on real-world datasets.</p><p>Our initial research design includes the following steps:</p><p>• Data collection: identify datasets with hierarchical time series data describing sales/demand for multiple product categories and regions with exogenous variables.
• Tool evaluation: assess the applicability of existing libraries for hierarchical forecasting and XAI techniques.
• Model implementation: build global models that consider multiple series and exogenous variables.
• Feature importance analysis: apply model attribution methods and aggregation and decomposition techniques to identify key features and analyse their impact on the forecast.
• Model reasoning: analyse the feature contributions to the forecast and identify underlying rules at different levels of the hierarchy.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Data collection and preprocessing</head><p>To model the hierarchical impact of features on forecasting, we must use datasets with multiple series and exogenous variables that represent demand drivers. Multiple open sales datasets are available; however, only a few include a hierarchical structure and exogenous variables, such as the M5 competition <ref type="bibr" target="#b9">[10]</ref> dataset and some Kaggle datasets <ref type="bibr" target="#b25">[26]</ref>. For our initial exploration, we sampled the M5 competition <ref type="bibr" target="#b9">[10]</ref> dataset, which includes sales data for multiple product categories and regions. The dataset contains daily sales information for 3049 products in 10 stores over 5 years. For our analysis, we identified three products that have similar sales patterns and are sold in two states and five stores. As the products are from the same category and department, the hierarchy at the product level was not considered. The reason for this filtering is to reduce the complexity of the model and to focus on the feature importance analysis. The selected products are FOODS_3_586, FOODS_3_080, and FOODS_3_555, sold in two states, Texas (TX) and Wisconsin (WI). The total sales data for these products are shown in Figure <ref type="figure">2</ref>. Our hierarchical structure is shown in Figure <ref type="figure">3</ref>. It should be mentioned that the hierarchical structure can be inverted, meaning that the products can be at the top level and the stores at the bottom level, so technically our dataset is grouped time series data. Data preprocessing consists of two main parts: preparing the sales data and the exogenous variables. Sales data were aggregated at the weekly level. The weeks at the beginning and end of the dataset were removed to obtain a consistent time period. As features, lagged sales data were included to capture the temporal dependencies. The exogenous variables were related to pricing and calendar events.
The selling price was already aggregated at the weekly level for each store and product. Calendar events included whether a day was a holiday, had special events, and whether it was a SNAP (Supplemental Nutrition Assistance Program) day in the respective state. To include these variables, they were counted for each week and state. In addition, the week of the year was included as a feature to capture seasonality. The data were split into training and test sets, and the last complete year (2015) was used to test the model. The structure of the sales data is shown in Table <ref type="table">1</ref> and the exogenous variables in Table <ref type="table">2</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Table 3</head><p>Model hyperparameter search space.</p></div>
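The weekly aggregation and the per-week counting of calendar events described above can be sketched in plain Python; the dates, sales figures, and SNAP flags below are hypothetical stand-ins for the M5 records:

```python
from datetime import date, timedelta

# Hypothetical daily records for one store: (date, units_sold, is_snap_day).
daily = [
    (date(2015, 3, 2) + timedelta(days=i), 10 + i % 3, i % 7 == 0)
    for i in range(14)  # two full Monday-to-Sunday weeks
]

# Aggregate sales and count SNAP days per ISO (year, week).
weekly = {}
for day, units, snap in daily:
    key = day.isocalendar()[:2]          # (ISO year, ISO week number)
    sales, snap_days = weekly.get(key, (0, 0))
    weekly[key] = (sales + units, snap_days + int(snap))
```

The same grouping key extends naturally to (year, week, state) when counting events and SNAP days per state, as done for the exogenous variables in Table 2.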
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">Model implementation</head><p>The modelling approach is to build a single global model on all series and exogenous variables for bottom-up aggregation. For creating forecast models, the skforecast <ref type="bibr" target="#b26">[27]</ref> library was used. The base model for hierarchical forecasting was LightGBM <ref type="bibr" target="#b8">[9]</ref>, which we chose because of its efficiency and its widespread usage on this dataset in the M5 competition <ref type="bibr" target="#b9">[10]</ref>. Other ensemble models such as Random Forest or Gradient Boosting Machines could be used as well. Other reasons for choosing LightGBM are that it can handle categorical variables without the need for one-hot encoding, and that it supports model-specific split- and gain-based global feature importance methods.</p><p>Hyperparameter tuning was performed using the Optuna library <ref type="bibr" target="#b27">[28]</ref> with Bayesian optimisation. The search space (Table 3) was defined for the parameters of the LightGBM model, including the number of estimators, the minimum number of samples in a leaf, and the maximum depth of the trees. In addition, the number of lagged sales records used as features was included in the search space. For the search, the data were split into training and validation sets, the last year being the validation set used for backtesting. The performance of the model was evaluated as the mean squared error (MSE) on the validation set for each configuration.
The best configuration found had 239 estimators and a maximum depth of 26, with a backtesting MSE of 4263.01. The lagged sales records used as features were 1, 4, 5, 13, and 52 weeks.</p><p>The feature input for the final model is a table with the following columns:</p><p>• week_of_year represented as numerical values (1-52)
• sell_price for the week for the product in the store
• num_of_events for the week
• snap_days for the week in the state
• lag_n for n in {1, 4, 5, 13, 52} representing the sales from the previous weeks
• series_id noted as (_level_skforecast), encoded as a numerical value representing the series hierarchy</p><p>Series_id could have been encoded as a one-hot vector or as a categorical variable, given that the latter is supported by LightGBM. A one-hot encoded vector would have increased the number of features and the complexity of the model, while the categorical-variable option was not used because the SHAP library applied in our analysis does not support categorical features.</p></div>
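The lag-feature construction behind the lag_n columns can be sketched for a single series as follows (a simplified version of what skforecast does internally; the toy series is hypothetical):

```python
import numpy as np

def make_lag_features(series, lags=(1, 4, 5, 13, 52)):
    """Build a lag-feature matrix X and target vector y from one weekly series.

    Rows start at the first index where the largest lag is available,
    so every row has all of its lagged values defined.
    """
    start = max(lags)
    X = np.column_stack([series[start - lag : len(series) - lag] for lag in lags])
    y = series[start:]
    return X, y

sales = np.arange(60, dtype=float)   # toy weekly sales series
X, y = make_lag_features(sales)
```

With 60 weeks of data and a maximum lag of 52, only 8 training rows remain, which illustrates why long lags such as 52 weeks are costly in terms of usable history.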
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3.">Feature importance analysis and model reasoning</head><p>Two initial ideas were considered to analyse the importance of features. The first involves using the mean SHAP values for cohorts representing different levels of the hierarchy, providing information on the contribution of features throughout the structure. The second approach is based on conditional permutation importance, which evaluates the importance of features while accounting for the hierarchical structure, following the idea of subgroup-based permutation importance <ref type="bibr" target="#b28">[29]</ref>. The first method was prioritised for implementation due to the availability of support in the SHAP library <ref type="bibr" target="#b29">[30]</ref>. Given an instance 𝑥 for prediction, the SHAP value of feature 𝑖 is 𝜙 𝑖 (𝑥). Each 𝑥 is part of a cohort 𝐶 𝑘 based on the series hierarchy 𝑘. The contribution value or importance of feature 𝑖 for a cohort 𝐶 𝑘 is calculated as</p><formula xml:id="formula_0">𝜙 𝑖 (𝐶 𝑘 ) = 1 |𝐶 𝑘 | ∑ 𝑥∈𝐶 𝑘 |𝜙 𝑖 (𝑥)|<label>(1)</label></formula><p>where |𝐶 𝑘 | is the cardinality of 𝐶 𝑘 and |𝜙 𝑖 (𝑥)| is the absolute SHAP value of feature 𝑖 for instance 𝑥.</p><p>Steps for the feature importance analysis:</p><p>• For each prediction instance 𝑥 and feature 𝑖, calculate the SHAP value 𝜙 𝑖 (𝑥).
• Split the instances into cohorts according to the hierarchy levels.
• Calculate the mean absolute SHAP values for each cohort 𝐶 𝑘 .
• Visualise the mean absolute SHAP values for each cohort 𝐶 𝑘 and create summary plots.</p><p>The reasoning of the model is based on the analysis of the contributions of the features to the forecast. The aim is to identify the underlying rules and patterns that the model uses to make predictions. The SHAP values provide a way to understand the impact of the features on the forecast. The analysis can be done at different levels of the hierarchy, from the global model to the state and store levels.</p></div>
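Equation (1) can be implemented directly once per-instance SHAP values are available; in the sketch below the SHAP matrix and cohort labels are toy values, not outputs of the actual model:

```python
import numpy as np

def cohort_importance(shap_values, cohort_ids):
    """Mean absolute SHAP value per feature within each cohort (Eq. 1).

    shap_values: (n_instances, n_features) array of per-instance SHAP values.
    cohort_ids:  length-n_instances sequence assigning each instance to a cohort.
    """
    cohort_ids = np.asarray(cohort_ids)
    return {
        c: np.abs(shap_values[cohort_ids == c]).mean(axis=0)
        for c in set(cohort_ids.tolist())
    }

# Toy SHAP values for 3 instances and 2 features, split into two cohorts.
shap_values = np.array([[0.5, -1.0],
                        [-0.5, 2.0],
                        [3.0, 0.0]])
cohorts = ["TX", "TX", "WI"]
imp = cohort_importance(shap_values, cohorts)
```

Taking absolute values before averaging prevents positive and negative contributions from cancelling out, which is exactly why Eq. (1) uses |𝜙 𝑖 (𝑥)| rather than the raw SHAP values.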
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Preliminary Results</head><p>The preliminary results focus mainly on the practical application of SHAP values in hierarchical forecasting models rather than on the theoretical aspects of feature importance. As preliminary results, we present mean absolute SHAP values at different aggregation levels. These provide an overview of the main contributors to the forecast at different levels of the hierarchy.</p><p>Furthermore, we visualise the distribution of SHAP values at different aggregation levels using violin plots, which provide a representation of the variability and density of the SHAP values for each feature across the hierarchy. This double representation allows for a more detailed analysis of the contribution of features to the forecast, for example, whether a feature contributes positively or negatively to the forecast. In the following, we present several cases at different aggregation levels, starting from the global level down to the store and product level; the presentation is not meant to be exhaustive.</p><p>At the global level, the SHAP values in Figure <ref type="figure" target="#fig_2">4a</ref> show that the most important features are the lagged sales values, especially the sales value of lag 1, which has the highest SHAP value. This is expected, as prior sales are the most important factor in predicting future sales. The violin plot in Figure <ref type="figure" target="#fig_2">4b</ref> shows the distribution of SHAP values for each feature. It shows that the actual impact of the lag value is negative most of the time, as after a week with higher sales, demand the following week can drop. In the case of state-level grouping, the number of samples differs for the two groups, one of them having only two stores included. This can be observed in the wider distribution on the violin plot of the Texas (TX) state.</p><p>At the lowest level of the hierarchy, presented in Figure <ref type="figure" target="#fig_7">7</ref>, deviations can be revealed in the order of importance of the features. For example, in Figure <ref type="figure" target="#fig_7">7d</ref> the week-of-year feature has a greater impact on the forecast than some of the lag values, unlike in other cases such as Figure <ref type="figure" target="#fig_7">7c</ref>. This can be due to the fact that the store TX_2 has a different seasonality pattern or might have recurring special events, since the number of events is also a feature with a higher impact in this store. What is problematic in this case is that, due to the large number of series, the representation of the mean absolute SHAP values is hardly comprehensible in the previous bar plot form. As a workaround, the feature contributions of each group are presented in a grouped form in Figure <ref type="figure" target="#fig_7">7b</ref>. What can be misleading in this case is the lack of ordering by impact and the different scale of the 𝑥 axis for each feature.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Discussion</head><p>In this work, we managed to calculate the Shapley values for the predictions of a hierarchical forecasting model, with some limitations, and aggregated these values to different levels of the hierarchy. By this, we addressed the first two research questions. We used a sample of a real-world dataset to evaluate the proposed method, working towards the third research question. Responding to the fourth research question, we visualised the SHAP values at different levels of the hierarchy and provided some interpretation of the results. To address the fifth research question, we plan to expand the literature review to include a wider range of XAI techniques.</p><p>This research is expected to contribute in several key areas. First, it will provide an evaluation of feature importance methods in the context of hierarchical forecasting models. This will help to identify which methods are most effective and how they can be applied to improve model interpretability. Second, it aims to provide guidelines, best practices, and limitations for effectively explaining these models. Finally, the research will support the development of tools that improve the understanding of hierarchical forecasting models and their underlying rules and reasoning.</p><p>The limitations of our study include the handling of categorical variables in the SHAP library. The ordinal encoding that induces an order on the categorical variables may not be appropriate in all cases. Low feature importance for the categorical variables may be due to the encoding method. Recent research <ref type="bibr" target="#b19">[20]</ref> proposes a method to handle categorical variables for conditional feature importance. With regard to data limitations, the dataset used is simplified in multiple dimensions. First, with the aggregation of sales data at the weekly level, multiple exogenous variables such as special events could not be included.
Second, the dataset is limited to a single product category, which may not be representative of all hierarchical forecasting scenarios. Lastly, the input data were limited to the sales lags of the product itself, without considering other products in the same category. The independence of products in the same category may not be a realistic assumption.</p><p>During the implementation of our initial approach, we encountered several challenges. One of the main issues was the lack of appropriate tools. For example, the SHAP library does not support categorical features in its current version. In addition, we faced difficulties in visualising the results; although the SHAP library offers integrated graphing functions, they do not deal effectively with a large number of cohorts, leading to errors and incomplete plots. Although these issues are not straightforward to solve, they are a sign of unexplored areas in the field of XAI and hierarchical forecasting.</p><p>In addition to these challenges, there are also potential risks that could impact the research. One of the main risks is the availability of data, especially real-world datasets that include exogenous variables or demand drivers. Synthetic datasets can be used as an alternative, but they may not capture the complexity of real-world scenarios. The evaluation of the methods is another potential risk, as it may be difficult to assess the performance of the explanation methods. In the case of application-grounded evaluation, it may be difficult to find experts who can provide meaningful feedback, given that each product category may require different domain knowledge <ref type="bibr" target="#b30">[31]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.1.">Future work</head><p>To address the challenges and limitations of the current research, several next steps are proposed for each part of the research.</p><p>• First, the literature review will be extended to include a broader range of XAI techniques. While the current work focuses on feature importance-based evaluation, partial dependence plots, feature interaction, and other XAI techniques should also be included in the review.</p><p>• Data collection will be expanded to include synthetic datasets and additional real-world datasets. In addition, incorporating more data from the actual dataset and forecasting at the daily level are possible future directions.</p><p>• The tool will need to be implemented and evaluated for other methods. Conditional permutation importance can also be evaluated once the method is implemented.</p><p>• The model implementation could be extended to include additional ML models for hierarchical forecasting. A further enhancement would be dependent multi-series forecasting, as product sales are usually not independent of each other, especially within the same product category.</p><p>• Rule extraction based on feature importance and interaction is another future direction.</p><p>• After covering the methodological aspects, an empirical study with evaluation in terms of accuracy and computational efficiency is planned.</p></div>
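As a point of reference for the conditional permutation importance mentioned above, the simpler unconditional variant can be sketched as follows. This is a generic illustration on synthetic data with a least-squares model, not an implementation from the study; the conditional variant would instead permute a feature within subgroups that preserve its dependence on the other features.

```python
import numpy as np

# Synthetic regression problem: feature 0 drives the target most strongly,
# feature 1 weakly, feature 2 not at all. All choices here are assumptions.
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 3))
y = 2.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=500)

# Fit a simple linear model by least squares to stand in for any predictor.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

def mse(y_true, y_pred):
    return float(np.mean((y_true - y_pred) ** 2))

baseline = mse(y, X @ coef)

importances = []
for j in range(X.shape[1]):
    X_perm = X.copy()
    X_perm[:, j] = rng.permutation(X_perm[:, j])  # break the feature-target link
    importances.append(mse(y, X_perm @ coef) - baseline)

# Permuting feature 0 should increase the error the most.
print(importances)
```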
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.2.">Conclusion</head><p>Nowadays, every organisation strives to become data driven. In this context, data-driven decision making is crucial for optimising business processes and remaining competitive. This effort is supported by data mining, machine learning, and AI techniques. To avoid blindly trusting ML models, it is crucial to understand the reasoning behind their decisions. Our goal is to demystify hierarchical forecasting models by applying XAI techniques. This study explores the use of SHAP values to explain feature importance in hierarchical forecasting models. Our preliminary results focused on the practical aspects of aggregating SHAP values at different levels of the hierarchy, an approach that provides insights into the model's reasoning. We plan to extend this work by evaluating other XAI techniques to enhance the explainability of hierarchical forecasting models.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Research design</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 2 :Figure 3 :</head><label>23</label><figDesc>Figure 2: Total weekly sales for the chosen products</figDesc><graphic coords="5,117.13,65.61,361.01,203.36" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 4 :</head><label>4</label><figDesc>Figure 4: Global summary</figDesc><graphic coords="8,72.00,122.13,203.07,139.81" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head></head><label></label><figDesc>State WI SHAP values (a) State level SHAP values (b) State level SHAP values (c) TX state product SHAP summary (d) WI state product SHAP summary</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_4"><head></head><label></label><figDesc>State and product level SHAP values (c) TX state FOODS_3_586 product SHAP summary (d) WI state FOODS_3_080 product SHAP summary</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_5"><head>Figure 6 :</head><label>6</label><figDesc>Figure 6: State and product level summary</figDesc><graphic coords="10,72.00,166.85,203.07,277.15" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_6"><head></head><label></label><figDesc>Store and product level SHAP values for one feature (c) Store TX_3 FOODS_3_586 product SHAP summary (d) Store TX_2 FOODS_3_080 product SHAP summary</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_7"><head>Figure 7 :</head><label>7</label><figDesc>Figure 7: Store and product level summary</figDesc><graphic coords="11,72.00,351.95,225.63,159.48" type="bitmap" /></figure>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgement</head><p>This work was done in collaboration with my Ph.D. supervisor, Laura Diosan, from Babes-Bolyai University. I am grateful for her continued support and encouragement throughout this research.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Explanation interfaces for sales forecasting</title>
		<author>
			<persName><forename type="first">T</forename><surname>Fahse</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Blohm</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Hruby</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Van Giffen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">ECIS 2022 Research-in-Progress Papers</title>
				<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Forecasting of demand using arima model</title>
		<author>
			<persName><forename type="first">J</forename><surname>Fattah</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Ezzine</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Aman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><forename type="middle">E</forename><surname>Moussami</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Lachhab</surname></persName>
		</author>
		<idno type="DOI">10.1177/1847979018808673</idno>
		<ptr target="https://doi.org/10.1177/1847979018808673" />
	</analytic>
	<monogr>
		<title level="j">International Journal of Engineering Business Management</title>
		<imprint>
			<biblScope unit="volume">10</biblScope>
			<biblScope unit="page">1847979018808673</biblScope>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Demand forecasting: Literature review on various methodologies</title>
		<author>
			<persName><forename type="first">C</forename><surname>Ingle</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Bakliwal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Jain</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Singh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Kale</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Chhajed</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">2021 12th International Conference on Computing Communication and Networking Technologies (ICCCNT), IEEE</title>
				<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="1" to="7" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Comparison of statistical and machine learning methods for daily sku demand forecasting</title>
		<author>
			<persName><forename type="first">E</forename><surname>Spiliotis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Makridakis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A.-A</forename><surname>Semenoglou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Assimakopoulos</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Operational Research</title>
		<imprint>
			<biblScope unit="volume">22</biblScope>
			<biblScope unit="page" from="3037" to="3061" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<monogr>
		<author>
			<persName><forename type="first">N</forename><surname>Vandeput</surname></persName>
		</author>
		<ptr target="https://books.google.ro/books?id=C_u8EAAAQBAJ" />
		<title level="m">Demand forecasting best practices</title>
				<imprint>
			<publisher>Manning</publisher>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<monogr>
		<title level="m" type="main">Random Forests</title>
		<author>
			<persName><forename type="first">L</forename><surname>Breiman</surname></persName>
		</author>
		<idno type="DOI">10.1023/a:1010933404324</idno>
		<imprint>
			<date type="published" when="2001">2001</date>
			<biblScope unit="volume">45</biblScope>
			<biblScope unit="page" from="5" to="32" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Greedy function approximation: A gradient boosting machine</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">H</forename><surname>Friedman</surname></persName>
		</author>
		<idno type="DOI">10.1214/aos/1013203451</idno>
	</analytic>
	<monogr>
		<title level="j">Annals of Statistics</title>
		<imprint>
			<biblScope unit="volume">29</biblScope>
			<biblScope unit="page" from="1189" to="1232" />
			<date type="published" when="2001">2001</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<monogr>
		<title level="m" type="main">XGBoost: A Scalable Tree Boosting System</title>
		<author>
			<persName><forename type="first">T</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Guestrin</surname></persName>
		</author>
		<idno type="DOI">10.1145/2939672.2939785</idno>
		<imprint>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">LightGBM: A Highly Efficient Gradient Boosting Decision Tree</title>
		<author>
			<persName><forename type="first">G</forename><surname>Ke</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Meng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Finley</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Ma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Ye</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T.-Y</forename><surname>Liu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Neural Information Processing Systems</title>
				<imprint>
			<date type="published" when="2017">2017</date>
			<biblScope unit="page" from="3108" to="3116" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">M5 accuracy competition: Results, findings, and conclusions</title>
		<author>
			<persName><forename type="first">S</forename><surname>Makridakis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Spiliotis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Assimakopoulos</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.ijforecast.2021.11.013</idno>
		<ptr target="https://doi.org/10.1016/j.ijforecast.2021.11.013" />
	</analytic>
	<monogr>
		<title level="j">International Journal of Forecasting</title>
		<imprint>
			<biblScope unit="volume">38</biblScope>
			<biblScope unit="page" from="1346" to="1364" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<monogr>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">A</forename><surname>Rodrigo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">E</forename><surname>Ortiz</surname></persName>
		</author>
		<ptr target="https://skforecast.org/0.12.1/user_guides/dependent-multi-series-multivariate-forecasting.html" />
		<title level="m">Global forecasting models: Dependent multi-series forecasting (multivariate forecasting</title>
				<imprint>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<monogr>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">J</forename><surname>Hyndman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Athanasopoulos</surname></persName>
		</author>
		<ptr target="https://otexts.com/fpp3/index.html" />
		<title level="m">Forecasting: principles and practice</title>
				<meeting><address><addrLine>OTexts</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
	<note>3 ed</note>
</biblStruct>

<biblStruct xml:id="b12">
	<monogr>
		<author>
			<persName><forename type="first">C</forename><surname>Molnar</surname></persName>
		</author>
		<ptr target="https://christophm.github.io/interpretable-ml-book" />
		<title level="m">Interpretable Machine Learning</title>
				<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
	<note>2 ed</note>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">A survey of methods for explaining black box models</title>
		<author>
			<persName><forename type="first">R</forename><surname>Guidotti</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Monreale</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Ruggieri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Turini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Giannotti</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Pedreschi</surname></persName>
		</author>
		<idno type="DOI">10.1145/3236009</idno>
		<ptr target="https://doi.org/10.1145/3236009" />
	</analytic>
	<monogr>
		<title level="j">ACM Comput. Surv</title>
		<imprint>
			<biblScope unit="volume">51</biblScope>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Random forests</title>
		<author>
			<persName><forename type="first">L</forename><surname>Breiman</surname></persName>
		</author>
		<idno type="DOI">10.1023/A:1010933404324</idno>
		<ptr target="https://doi.org/10.1023/A:1010933404324" />
	</analytic>
	<monogr>
		<title level="j">Machine Learning</title>
		<imprint>
			<biblScope unit="volume">45</biblScope>
			<biblScope unit="page" from="5" to="32" />
			<date type="published" when="2001">2001</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<monogr>
		<title level="m" type="main">Limitations of interpretable machine learning methods</title>
		<author>
			<persName><forename type="first">C</forename><surname>Molnar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Gruber</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Kopper</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<monogr>
		<author>
			<persName><forename type="first">C</forename><surname>Molnar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>König</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Herbinger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Freiesleben</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Dandl</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">A</forename><surname>Scholbeck</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Casalicchio</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Grosse-Wentrup</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Bischl</surname></persName>
		</author>
		<title level="m">Pitfalls to avoid when interpreting machine learning models</title>
				<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">Unrestricted permutation forces extrapolation: variable importance requires at least one more model, or there is no free variable importance</title>
		<author>
			<persName><forename type="first">G</forename><surname>Hooker</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Mentch</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Zhou</surname></persName>
		</author>
		<idno type="DOI">10.1007/s11222-021-10057-z</idno>
	</analytic>
	<monogr>
		<title level="j">Statistics and Computing</title>
		<imprint>
			<biblScope unit="volume">31</biblScope>
			<biblScope unit="page" from="1" to="16" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Understanding Global Feature Contributions With Additive Importance Measures</title>
		<author>
			<persName><forename type="first">I</forename><surname>Covert</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Lundberg</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S.-I</forename><surname>Lee</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Neural Information Processing Systems</title>
		<imprint>
			<biblScope unit="volume">33</biblScope>
			<biblScope unit="page" from="17212" to="17223" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">Conditional feature importance for mixed data</title>
		<author>
			<persName><forename type="first">Kristin</forename><surname>Blesch</surname></persName>
		</author>
		<author>
			<persName><forename type="first">David</forename><forename type="middle">S</forename><surname>Watson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Marvin</forename><forename type="middle">N</forename><surname>Wright</surname></persName>
		</author>
		<idno type="DOI">10.1007/s10182-023-00477-9</idno>
	</analytic>
	<monogr>
		<title level="j">AStA Advances in Statistical Analysis</title>
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">A unified approach to interpreting model predictions</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">M</forename><surname>Lundberg</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S.-I</forename><surname>Lee</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 31st International Conference on Neural Information Processing Systems, NIPS&apos;17</title>
				<meeting>the 31st International Conference on Neural Information Processing Systems, NIPS&apos;17<address><addrLine>Red Hook, NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>Curran Associates Inc</publisher>
			<date type="published" when="2017">2017</date>
			<biblScope unit="page" from="4768" to="4777" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<monogr>
		<title level="m" type="main">TsSHAP: Robust model agnostic feature-based explainability for time series forecasting</title>
		<author>
			<persName><forename type="first">V</forename><forename type="middle">C</forename><surname>Raykar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Jati</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Mukherjee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Aggarwal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Sarpatwar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Ganapavarapu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Vaculín</surname></persName>
		</author>
		<idno type="DOI">10.48550/arxiv.2303.12316</idno>
		<idno type="arXiv">2303.12316</idno>
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<analytic>
		<title level="a" type="main">Classification and regression trees</title>
		<author>
			<persName><forename type="first">L</forename><surname>Breiman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">H</forename><surname>Friedman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">A</forename><surname>Olshen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">J</forename><surname>Stone</surname></persName>
		</author>
		<idno type="DOI">10.2307/2530946</idno>
	</analytic>
	<monogr>
		<title level="j">Biometrics</title>
		<imprint>
			<date type="published" when="1984">1984</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<analytic>
		<title level="a" type="main">From Local Explanations to Global Understanding with Explainable AI for Trees</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">M</forename><surname>Lundberg</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Erion</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">J</forename><surname>Degrave</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">M</forename><surname>Prutkin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Nair</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Katz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Himmelfarb</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Bansal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S.-I</forename><surname>Lee</surname></persName>
		</author>
		<idno type="DOI">10.1038/s42256-019-0138-9</idno>
	</analytic>
	<monogr>
		<title level="j">Nature Machine Intelligence</title>
		<imprint>
			<biblScope unit="volume">2</biblScope>
			<biblScope unit="page" from="56" to="67" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<monogr>
		<title level="m" type="main">Towards a rigorous evaluation of explainability for multivariate time series</title>
		<author>
			<persName><forename type="first">R</forename><surname>Saluja</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Malhi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Knapič</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Främling</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Cavdar</surname></persName>
		</author>
		<idno type="DOI">10.2139/ssrn.4627337</idno>
		<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b25">
	<monogr>
		<author>
			<orgName>Corporación Favorita</orgName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Elliott</surname></persName>
		</author>
		<ptr target="https://kaggle.com/competitions/favorita-grocery-sales-forecasting" />
		<title level="m">Corporación favorita grocery sales forecasting</title>
				<imprint>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b26">
	<monogr>
		<title level="m" type="main">skforecast</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">A</forename><surname>Rodrigo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">E</forename><surname>Ortiz</surname></persName>
		</author>
		<idno type="DOI">10.5281/zenodo.8382788</idno>
		<ptr target="https://skforecast.org/" />
		<imprint>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b27">
	<analytic>
		<title level="a" type="main">Optuna: A next-generation hyperparameter optimization framework</title>
		<author>
			<persName><forename type="first">T</forename><surname>Akiba</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Sano</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Yanase</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Ohta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Koyama</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining</title>
		<meeting>the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining</meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b28">
	<analytic>
		<title level="a" type="main">Model-agnostic feature importance and effects with dependent features – a conditional subgroup approach</title>
		<author>
			<persName><forename type="first">C</forename><surname>Molnar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>König</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Bischl</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Casalicchio</surname></persName>
		</author>
		<idno type="DOI">10.1007/s10618-022-00901-9</idno>
	</analytic>
	<monogr>
		<title level="j">Data Mining and Knowledge Discovery</title>
		<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b29">
	<monogr>
		<title level="m" type="main">Consistent Individualized Feature Attribution for Tree Ensembles</title>
		<author>
			<persName><forename type="first">Scott</forename><surname>Lundberg</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Gabriel</forename><surname>Erion</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Su-In</forename><surname>Lee</surname></persName>
		</author>
		<idno>arXiv:</idno>
		<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b30">
	<analytic>
		<title level="a" type="main">Considerations for evaluation and generalization in interpretable machine learning</title>
		<author>
			<persName><forename type="first">F</forename><surname>Doshi-Velez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Kim</surname></persName>
		</author>
		<idno type="DOI">10.1007/978-3-319-98131-4_1</idno>
	</analytic>
	<monogr>
		<title level="m">The Springer Series on Challenges in Machine Learning</title>
		<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="3" to="17" />
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
