<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Explainable Federated Learning by Incremental Decision Trees ⋆</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Parisa</forename><surname>Jamshidi</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Center for Applied Intelligent Systems Research (CAISR)</orgName>
								<orgName type="institution">Halmstad University</orgName>
								<address>
									<settlement>Halmstad</settlement>
									<country key="SE">Sweden</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Sławomir</forename><surname>Nowaczyk</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Center for Applied Intelligent Systems Research (CAISR)</orgName>
								<orgName type="institution">Halmstad University</orgName>
								<address>
									<settlement>Halmstad</settlement>
									<country key="SE">Sweden</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Mahmoud</forename><surname>Rahat</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Center for Applied Intelligent Systems Research (CAISR)</orgName>
								<orgName type="institution">Halmstad University</orgName>
								<address>
									<settlement>Halmstad</settlement>
									<country key="SE">Sweden</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Zahra</forename><surname>Taghiyarrenani</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Center for Applied Intelligent Systems Research (CAISR)</orgName>
								<orgName type="institution">Halmstad University</orgName>
								<address>
									<settlement>Halmstad</settlement>
									<country key="SE">Sweden</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Explainable Federated Learning by Incremental Decision Trees ⋆</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">28399D01A22EDC67D9418DB4E5D42B03</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T18:54+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>eXplainable AI (XAI)</term>
					<term>Federated Learning</term>
					<term>Incremental Decision Tree</term>
					<term>Extremely Fast Decision Tree</term>
					<term>Data Stream</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Explainable Artificial Intelligence (XAI) is crucial in ensuring transparency, accountability, and trust in machine learning models, especially in applications involving high-stakes decision-making. This paper focuses on addressing the research gap in federated learning (FL), specifically emphasizing the use of inherently interpretable underlying models. While most FL frameworks rely on complex, black-box models such as Artificial Neural Networks (ANNs), we propose using Decision Tree (DT) classifiers to maintain explainability. More specifically, we introduce a novel framework for horizontal federated learning using Extremely Fast Decision Trees (EFDTs) with streaming data on the client side. Our approach involves aggregating clients' EFDTs on the server side without centralizing raw data, and the training process occurs on the clients' sides. We outline three aggregation strategies and demonstrate that our methods outperform local models and achieve performance levels close to centralized models while retaining inherent explainability.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><note place="foot">TempXAI@ECML-PKDD'24: Explainable AI for Time Series and Data Streams Tutorial-Workshop, Sep. 9th, 2024, Vilnius, Lithuania. Emails: parisa.jamshidi@hh.se (P. Jamshidi); slawomir.nowaczyk@hh.se (S. Nowaczyk); mahmoud.rahat@hh.se (M. Rahat); zahra.taghiyarrenani@hh.se (Z. Taghiyarrenani). ORCID: 0000-0001-7055-2706 (P. Jamshidi); 0000-0002-7796-5201 (S. Nowaczyk); 0000-0003-2590-6661 (M. Rahat); 0000-0002-1759-8593 (Z. Taghiyarrenani).</note><p>Explainable Artificial Intelligence (XAI) is essential in machine learning and artificial intelligence as it improves transparency, accountability, trust, and the ability to enhance and debug AI systems. As AI technologies expand to various sectors, from healthcare to finance, the complexity and opacity of these models often prevent their deployment in critical decision-making processes. XAI seeks to mitigate this issue by providing insights into how AI algorithms reason and how they arrive at specific conclusions, enabling users to validate and comprehend model behaviors. Most XAI research today focuses on post-hoc explanations for black-box models. These models have complex internal workings that are difficult for humans to interpret or understand. Various forms of analysis, from surrogate models to gradient credit assignment, are often used to explain these models and their decisions. However, post-hoc explanations have several drawbacks. For example, they often fail to capture the true inner workings of the model, providing only surface-level insights that might not be trustworthy <ref type="bibr" target="#b0">[1,</ref><ref type="bibr" target="#b1">2]</ref>. Furthermore, they are vulnerable to adversarial attacks that can manipulate the explanations without changing the model's predictions <ref type="bibr" target="#b2">[3,</ref><ref type="bibr" target="#b3">4]</ref>. 
Such limitations make them a questionable choice in applications where transparency is a priority. Conversely, white-box models are designed to be transparent and interpretable from the outset. These models allow users to understand their decision-making processes without the need for additional interpretability techniques <ref type="bibr" target="#b0">[1]</ref>.</p><p>Federated learning (FL) is a machine learning technique in which multiple devices record local data and share their models with a server. The server forms a global model and shares it back without exchanging raw data. This method is more efficient in terms of bandwidth and computing resources because it does not need to transfer large amounts of raw data to a central server.</p><p>Although this approach has gained much attention due to these capabilities, most of the underlying models for FL are Artificial Neural Networks (ANNs). Post-hoc explainability methods can be applied to the black-box models of FL. For example, <ref type="bibr" target="#b4">[5]</ref> used feature importance methods to add explainability to their models, while <ref type="bibr" target="#b5">[6]</ref> presented counterfactual explanation techniques. However, there is a surprising lack of FL techniques suitable for inherently interpretable models.</p><p>In this paper, we focus on Decision Tree (DT) classifiers. Shallow DTs, in contrast to NNs and other complex models, are structured so that their internal workings are easily understandable to humans. This allows for straightforward tracing of how inputs are transformed into outputs, facilitating immediate insight into the decision-making process. Such transparency is crucial for validating the model's logic, ensuring ethical compliance, and fostering trust among users. 
This makes DTs particularly suitable for the identification and correction of biases and errors within the model: their transparent nature enables users to pinpoint the specific aspects of the model that contribute to undesirable outcomes, facilitating targeted improvements. Moreover, some research indicates that tree-based models outperform ANNs in some applications, specifically on tabular data <ref type="bibr" target="#b6">[7,</ref><ref type="bibr" target="#b7">8]</ref>.</p><p>The typical approach in ML is batch learning, also known as offline learning, which involves training a model on a fixed dataset all at once. Online learning, or incremental learning, by contrast, continuously updates the model as new data becomes available, making it suitable for dynamic environments where data arrives sequentially. Much less research has addressed incremental learning in a federated fashion than batch learning.</p><p>Therefore, a gap remains for settings where clients have access to streaming data and aim, through federated collaboration, to obtain an inherently explainable global model that performs better than their local models. We propose a method to aggregate the clients' incremental decision trees in an FL framework; the aggregated models achieve higher performance than the local ones and, since the models are DTs, remain inherently explainable. Our contributions are as follows:</p><p>• We create a global model without centralizing raw data, using the statistical information stored in EFDTs. • In our proposed method, EFDTs are trained locally and aggregated in each round. • We introduce three aggregation methods on the server to aggregate the clients' EFDTs. • Furthermore, we maintain both local and global models as a single tree, keeping them inherently explainable.</p><p>The rest of the paper is organized as follows: Related Work summarizes previous research at the intersection of FL and tree-based methods. 
Then, in Preliminaries, we introduce basic concepts of incremental decision trees and tree similarity methods. The proposed method is described in Methodology, and the experimental setup in Experimental Setup, followed by the results in Results. Finally, we draw conclusions in Conclusion.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Related work</head><p>Within this distributed learning paradigm, horizontal federated learning (HFL) and vertical federated learning (VFL) represent two approaches. In HFL, each client has different data samples, but the feature set is the same. In VFL, by contrast, data samples are shared among clients that hold different features <ref type="bibr" target="#b8">[9]</ref>. SecureBoost <ref type="bibr" target="#b9">[10]</ref> is a VFL framework in which only one of the clients, called the active party, has the sample labels, while the rest, called passive parties, do not. This method works with a gradient-tree boosting algorithm in which the active party shares the gradient and Hessian values with the passive parties. After that, each passive party categorizes the samples into buckets based on its local features and provides the total values of each bucket to the active party in the form of a histogram. The active party then performs local calculations to find the best node splits and directly coordinates updates with the passive parties. Other methods, like OpBoost <ref type="bibr" target="#b10">[11]</ref>, were proposed to optimize SecureBoost. Further tree-based VFL methods include FEDXGB <ref type="bibr" target="#b11">[12]</ref> and VF2Boost <ref type="bibr" target="#b12">[13]</ref>, which are Homomorphic Encryption (HE)-based.</p><p>There is also research combining tree-based models and HFL. <ref type="bibr" target="#b13">[14]</ref> proposed a method based on GBDT (gradient-boosted decision trees), in which each client trains a decision tree using its local dataset and adds it to the global model in turn. Boosting-based Federated Random Forest (BOFRF) <ref type="bibr" target="#b14">[15]</ref> proposed a boosting framework for random forests in a federated manner. 
In BOFRF, each client builds its own random forest; on the server side, these forests are treated as weak classifiers, and a method is proposed to calculate their weights. Our approach instead explicitly trains a single tree, which is explainable, whereas forming the global model by bagging on the server results in a loss of explainability.</p><p>Another approach is to train a tree-based model collaboratively on a server: clients share local statistical information with the server, and the server decides on node partitioning in each round.</p><p>There are few works combining incremental decision trees and FL. <ref type="bibr" target="#b15">[16]</ref> and <ref type="bibr" target="#b16">[17]</ref> train an IDT in VFL and HFL, respectively. The former encrypts samples from the clients' streaming data and passes them to the server. In the latter, a Data Collector gathers the raw data from the clients and aggregates it. In both, the server uses that information to train a VFDT.</p><p>Our proposed method falls under the category of HFL, with features consistent across all clients. Additionally, we do not share raw data with the server. Instead, we train trees on the client side, and the server is responsible for aggregating them.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Preliminaries</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Incremental Decision Tree</head><p>The Hoeffding Tree (HT) <ref type="bibr" target="#b17">[18]</ref> is one of the fundamental research studies that proposed the Incremental Decision Tree (IDT) for efficient data stream mining. The HT uses the Hoeffding bound to determine when a statistically significant decision about splitting a node can be made based on the amount of data collected. The bound for a random variable 𝑟 with range 𝑅 is expressed as follows:</p><formula xml:id="formula_0">𝜖 = √(𝑅² ln(1/𝛿) / (2𝑛)),<label>(1)</label></formula><p>where 𝜖 represents the margin of error, 1 − 𝛿 is the desired confidence, and 𝑛 is the number of observations. In 2018, <ref type="bibr" target="#b18">[19]</ref> proposed the Hoeffding Anytime Tree, also known as the Extremely Fast Decision Tree (EFDT). EFDT enhances the capabilities of HT by including an "anytime" aspect: it can continuously update and revise its model as new data becomes available, without revisiting old data. While HT makes decisions based on data accumulated up to a certain point, EFDT continuously revisits and improves its decisions, resulting in better performance in dynamic data streams.</p></div>
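For concreteness, the bound in Equation (1) is straightforward to compute. The sketch below (the range, delta, and instance counts are illustrative values, not taken from this paper) shows how the margin shrinks as more observations reach a node, making a split decision progressively safer:

```python
import math

def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Margin of error epsilon from Eq. (1): with probability 1 - delta,
    the observed mean of a variable with the given range lies within
    epsilon of its true mean after n observations."""
    return math.sqrt((value_range ** 2) * math.log(1.0 / delta) / (2.0 * n))

# For a split heuristic with range 1 (e.g. information gain over binary
# labels) and delta = 1e-7, the margin shrinks with the sample size:
print(hoeffding_bound(1.0, 1e-7, 200))   # ~0.20, still too loose to split
print(hoeffding_bound(1.0, 1e-7, 5000))  # ~0.04, a split is much safer
```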
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">Decision Tree Similarity</head><p>An important step in Federated Learning is an aggregation of models from different clients, and a crucial component in that process is based on the similarity between them. Decision tree similarity methods aim to quantify the similarity between different decision trees. Over the years, several approaches have been proposed, typically divided into two main categories: syntactic and semantic similarities.</p><p>Syntactic similarity methods compare and analyze decision trees based on their structure. Techniques such as tree edit distance, which calculates the minimal changes needed to transform one tree into another, and structural similarity measures, which compare the arrangement and features of nodes, are commonly used <ref type="bibr" target="#b19">[20,</ref><ref type="bibr" target="#b20">21,</ref><ref type="bibr" target="#b21">22]</ref>.</p><p>Semantic similarity methods aim to capture the functional similarity between trees by considering the decisions made by each tree, which means they look at the similarity in the prediction distributions <ref type="bibr" target="#b22">[23,</ref><ref type="bibr" target="#b21">22]</ref>. This can consider the similarity of decisions made by each tree on a set of instances.</p></div>
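As a minimal illustration of the semantic view, the dissimilarity of two models can be measured as the fraction of instances on which their predictions differ (the prediction lists below are hypothetical, standing in for two decision trees evaluated on the same instances):

```python
def disagreement(preds_a, preds_b):
    """Fraction of instances on which two models' predictions differ:
    a simple semantic (prediction-based) dissimilarity measure."""
    assert len(preds_a) == len(preds_b)
    return sum(a != b for a, b in zip(preds_a, preds_b)) / len(preds_a)

# Two trees that agree on 3 of 4 instances disagree on 25% of them:
print(disagreement([0, 1, 1, 0], [0, 1, 0, 0]))  # 0.25
```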
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Methodology</head><p>Our methodology consists of two phases: training and aggregation. Training occurs on the clients' sides, while aggregation occurs on the server's side. Our training process is incremental, since we are in the streaming setting. Our main contribution resides in the aggregation phase, in which we contrast different aggregation strategies.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.">Problem Formulation</head><p>Assume we have 𝑘 clients 𝑐 1 ⋯ 𝑐 𝑘 , where each one has access to a different streaming dataset 𝐷 1 ⋯ 𝐷 𝑘 . Given that we are in a horizontal federated setting, the feature set 𝑓 1 ⋯ 𝑓 𝑚 is the same for all datasets.</p><p>In particular, in order to maintain the inherent interpretability of the global model, we require that the final result be limited to a single decision tree. While aggregating a number of trees into a forest is a natural way to extend DT learning to a federated setting, our work is one of the very few attempts to collaboratively create a single tree among all the clients.</p><p>In our proposed method, the server only aggregates the clients' trees without training them, so there is no need to share raw data with the server.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.">Incremental training on the client side</head><p>Each client (𝑐 𝑖 ) trains its own decision tree on its local dataset (𝐷 𝑖 ). As each new data point arrives, the client uses incremental learning to update its tree. After receiving a predefined number of data points, each client sends its trained decision tree to the server. It then receives a new tree from the server and continues using and training this new tree with the next part of its streaming data.</p><p>Each node of an EFDT stores an 𝑛 𝑖𝑗𝑘 table, which records the number of instances in which the 𝑖-th attribute has the 𝑗-th value and the class label is 𝑘. This table is used to calculate metrics such as information gain or the Gini index, which determine the best attribute for splitting the data at a node. Each client shares its tree, with this information, instead of raw data.</p></div>
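The 𝑛 𝑖𝑗𝑘 table can be pictured as a counter over (attribute, value, class) triples, updated once per arriving instance. The sketch below is one possible layout for illustration, not the EFDT implementation; the attribute encodings are hypothetical:

```python
from collections import defaultdict

class NodeStats:
    """Per-node n_ijk table: counts of (attribute i, value j, class k)."""
    def __init__(self):
        self.n = defaultdict(int)  # key: (attr_index, attr_value, class_label)

    def update(self, x, y):
        """x: dict mapping attribute index -> categorical value; y: class."""
        for i, j in x.items():
            self.n[(i, j, y)] += 1

    def class_counts(self, attr, value):
        """Counts per class for one (attribute, value) pair; these are the
        inputs to information gain / Gini when evaluating a split."""
        out = defaultdict(int)
        for (i, j, k), c in self.n.items():
            if i == attr and j == value:
                out[k] += c
        return dict(out)

stats = NodeStats()
stats.update({0: "red", 1: "small"}, y=1)
stats.update({0: "red", 1: "large"}, y=0)
print(stats.class_counts(0, "red"))  # {1: 1, 0: 1}
```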
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3.">Aggregation on the server side</head><p>The aggregation process begins when the server receives decision trees from the clients. We propose three different strategies for the server side. The aggregation process remains consistent across all strategies; each strategy differs only in how it selects the base tree. Before delving into the specifics of each strategy, let us first explore how the aggregation process operates.</p><p>As mentioned earlier, every node in an EFDT contains an 𝑛 𝑖𝑗𝑘 table. As part of the aggregation process, we merge the tables of nodes along the same path in the two trees. Algorithm 1 outlines this method in detail. Specifically, when two nodes from two trees share common parents, they can be effectively combined, even if they have opted for different features. This is possible because the table contains aggregated information before the split. As mentioned earlier, the strategies differ in how the base tree is chosen to combine the clients' models. Below are the details of the three proposed strategies.</p></div>
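The path-wise merge described by Algorithm 1 can be sketched in a few lines; the node layout and field names below are illustrative assumptions, not the authors' implementation:

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Node:
    feature: Optional[int] = None   # split attribute (None for a leaf)
    value: Optional[str] = None     # split value
    nijk: dict = field(default_factory=dict)
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def aggregate(node1, node2):
    """Merge node2's n_ijk counts into node1 along the common path of splits
    (a sketch of Algorithm 1's traverse/merge)."""
    if node1 is None or node2 is None:
        return
    for key, count in node2.nijk.items():           # merge the count tables
        node1.nijk[key] = node1.nijk.get(key, 0) + count
    # Descend only while both trees chose the same split; diverging
    # subtrees keep their own statistics.
    if node1.feature == node2.feature and node1.value == node2.value:
        aggregate(node1.left, node2.left)
        aggregate(node1.right, node2.right)

t1 = Node(feature=0, value="red", nijk={(0, "red", 1): 3})
t2 = Node(feature=0, value="red", nijk={(0, "red", 1): 2, (0, "blue", 0): 1})
aggregate(t1, t2)
print(t1.nijk)  # {(0, 'red', 1): 5, (0, 'blue', 0): 1}
```

Note that the merge is possible even when the children diverge, because each table summarizes the instances seen at that node before any split.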
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Algorithm 1 Aggregation Function</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3.1.">Syntactic strategy (Syn):</head><p>In this approach, the base tree is the most representative syntactic tree among the clients' trees, meaning it is the tree that requires the fewest changes to match the other trees. By using the edit distance on the bracket format of the trees, we can identify this tree, which we refer to as "Syn" in Algorithm 2. Then, the aggregation method is applied to "Syn" along with the other clients' trees. Finally, the server broadcasts the Syn tree to all clients.</p></div>
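A simplified sketch of base-tree selection: plain string edit distance over bracket encodings stands in here for the APTED tree edit distance applied in the paper, and the toy encodings are hypothetical:

```python
def levenshtein(a: str, b: str) -> int:
    """String edit distance (a simplified stand-in for the APTED tree
    edit distance over bracket-encoded trees)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def select_base_tree(bracket_forms):
    """Return the index of the tree needing the fewest total edits to
    match all the others (the representative syntactic tree, "Syn")."""
    totals = [sum(levenshtein(a, b) for b in bracket_forms) for a in bracket_forms]
    return totals.index(min(totals))

# Three toy bracket encodings; the middle tree is closest to both others:
trees = ["{f1{f2}{f3}}", "{f1{f2}{f4}}", "{f1{f5}{f4}}"]
print(select_base_tree(trees))  # 1
```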
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3.2.">Semantic strategy (Sem):</head><p>The base tree in this strategy is the tree with the minimum disagreement with the other clients' predictions. In Algorithm 3, the server uses a proxy dataset, 𝐷 𝑠 , to identify this tree, referred to as "Sem". It then iterates through all client trees, aggregating information from matching paths in each tree into the Sem tree to update it. Finally, the server broadcasts the Sem tree to all clients.</p><p>Algorithm 2 Syntactic strategy (Syn) </p></div>
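Selecting the semantic base tree amounts to an argmin over pairwise prediction disagreements on 𝐷 𝑠 . A minimal sketch, with hypothetical prediction lists standing in for the clients' trees evaluated on the proxy set:

```python
def select_semantic_base(predictions):
    """Given each client's predictions on the proxy dataset D_s, return
    the index of the tree with the lowest total disagreement with the
    other trees (the representative semantic tree, "Sem")."""
    k = len(predictions)
    totals = []
    for i in range(k):
        total = sum(
            sum(p != q for p, q in zip(predictions[i], predictions[j]))
            for j in range(k) if j != i
        )
        totals.append(total)
    return totals.index(min(totals))

# Client 1 agrees most with the others on the four proxy instances:
preds = [[0, 1, 1, 0], [0, 1, 0, 0], [1, 1, 0, 0]]
print(select_semantic_base(preds))  # 1
```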
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3.3.">Individual strategy (Ind):</head><p>In contrast to the previous strategies, this strategy constructs one tree per client. It compares each client's tree with the trees of all other clients and aggregates information from matching paths into the final tree for that client (Algorithm 4). After updating all trees, the server passes each updated tree back to its owner. We used centralized learning and local learning to establish an upper bound and a lower bound, respectively, for the mentioned strategies. In centralized learning, all the data exists in a single location, and a single model is trained. In contrast, in local learning, each client trains a model on its own data without interacting with other clients.</p></div>
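The outer loops of Algorithm 4 can be sketched as below. The `toy_aggregate` function is a hypothetical stand-in for the path-wise aggregation of Algorithm 1 (here a "tree" is reduced to a plain counts dictionary for illustration):

```python
import copy

def individual_strategy(clients_trees, aggregate):
    """Algorithm 4 sketch: build one aggregated tree per client by merging
    every other client's matching paths into a copy of its own tree."""
    inds = [copy.deepcopy(t) for t in clients_trees]
    for ind, own in zip(inds, clients_trees):
        for other in clients_trees:
            if other is not own:
                aggregate(ind, other)  # path-wise merge (Algorithm 1)
    return inds  # each client receives its own aggregated tree

# Toy stand-in: a "tree" is a counts dict and aggregation adds counts.
def toy_aggregate(dst, src):
    for key, c in src.items():
        dst[key] = dst.get(key, 0) + c

clients = [{"a": 1}, {"a": 2, "b": 1}, {"b": 3}]
result = individual_strategy(clients, toy_aggregate)
print(result[0])  # {'a': 3, 'b': 4}
```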
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Experimental Setup</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.1.">Dataset Description</head><p>We use two datasets with categorical features, which lets us better control the experiment and analyze the results. Below is a brief description of each.</p><p>Mushroom dataset: This dataset contains 8124 descriptions of hypothetical samples representing 23 species of gilled mushrooms in the Agaricus and Lepiota Family. Each species is classified as edible or poisonous. There are 22 features and 8124 data points <ref type="foot" target="#foot_0">1</ref> .</p><p>Chess (King-Rook vs. King-Pawn): This dataset contains two classes: white-can-win ("won") and white-cannot-win ("nowin"). The classes are mostly balanced at 52% and 48%, respectively. There are 36 features and 3196 data points<ref type="foot" target="#foot_1">2</ref> .</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.2.">Experiments Description</head><p>To distribute the data across clients, we used stratified splitting so that the class ratio is the same for all clients. Moreover, each client receives the same amount of data as every other client. To account for uncertainty, we performed this data-sharing scheme 10 times with different random seeds and averaged the results.</p><p>It is important to note that before dividing the data among clients, 1% of the data is set aside to be stored on the server side. This data, referred to as 𝐷 𝑠 in this paper, is used to identify the representative semantic tree.</p><p>Our model choice is the Extremely Fast Decision Tree (EFDT) <ref type="bibr" target="#b18">[19]</ref>, which is the same for all clients. For stable results, we used the River library in Python <ref type="bibr" target="#b23">[24]</ref>. The parameters for EFDT are set to the defaults; only binary_split is set to True. To construct the representative syntactic tree used in the Syn strategy, the APTED library is utilized <ref type="bibr" target="#b19">[20,</ref><ref type="bibr" target="#b20">21]</ref>.</p><p>For the federated learning setting, we consider 4 clients and 200 rounds of communication between the clients and the server. Accordingly, each client's data is divided into 200 chunks and fed in a streaming manner to the client's EFDT. This learning procedure is repeated for each strategy described in Section 4.</p></div>
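The data-sharing scheme described above (a 1% server holdout for 𝐷 𝑠 , then equal stratified shares across 4 clients under a fixed seed) might be sketched as follows; the function name and the round-robin assignment are illustrative assumptions, not the paper's code:

```python
import random
from collections import defaultdict

def stratified_client_split(labels, n_clients=4, server_frac=0.01, seed=0):
    """Split instance indices so each client gets an (almost) equal share
    with the same class ratio; server_frac is held out first as the proxy
    set D_s. A sketch of the data-sharing scheme, not the exact code."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    server, clients = [], [[] for _ in range(n_clients)]
    for y, idxs in by_class.items():
        rng.shuffle(idxs)
        n_server = max(1, int(len(idxs) * server_frac))
        server.extend(idxs[:n_server])
        # Round-robin assignment keeps both class ratios and sizes equal.
        for pos, idx in enumerate(idxs[n_server:]):
            clients[pos % n_clients].append(idx)
    return server, clients

labels = [0] * 520 + [1] * 480   # roughly the chess dataset's class ratio
server, clients = stratified_client_split(labels)
print(len(server), [len(c) for c in clients])
```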
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">Results</head><p>In this section, we present the results obtained using the proposed method, comparing its different strategies against centralized and local learning. Tables <ref type="table" target="#tab_3">1 and 2</ref> show the accuracy of the five strategies after selected communication rounds for the mushroom and chess datasets, respectively. In almost all cases, the performance of the proposed strategies falls between that of local and centralized learning, as expected. These strategies benefit from aggregation, allowing them to access more information at each point than local learning, although their access to information is still not as extensive as in centralized learning. We then take a closer look at the results on the mushroom dataset, shown in Figure <ref type="figure">2</ref>. We observe that the speed of performance improvement of the three proposed methods is initially higher than that of local learning (refer to the dashed box in the lower left part of Figure <ref type="figure">2</ref>). However, this difference decreases after several stages, eventually reaching less than 5% in the last rounds (see the dashed box in the upper right part of Figure <ref type="figure">2</ref>). This suggests that while a client with initially limited data cannot effectively compete with the aggregated methods, with access to more data it may understand the problem well enough to achieve strong performance independently, close to that of the aggregated methods.</p><p>In all the experiments with the chess dataset, there are significant differences between the performance of the proposed methods and that of local learning. Comparing the dashed boxes in Figure <ref type="figure">3</ref> shows that individual clients cannot achieve the same performance as the aggregated methods even after 200 rounds. 
However, as with the mushroom dataset, the performance of the aggregated methods improves quickly in the initial rounds here as well.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7.">Conclusion</head><p>This paper introduces a framework for horizontal federated learning using an Extremely Fast Decision Tree (EFDT) as the underlying model, with streaming data on the client side. We propose three strategies to aggregate this type of underlying model. Our strategies merge the information of multiple EFDTs at each round, which helps enhance performance at the client level. We also insist on keeping only a single tree on the client side, because a single (shallow) tree is inherently explainable, and we avoid black-box models due to concerns about post-hoc explanation methods. We compared the three strategies with local and centralized learning. All three strategies produce similar results; their performance is higher than that of local learning and close to (yet below) that of centralized learning. Experiments show that although, in some cases, the performance of the proposed methods may not differ much from local learning after many rounds, it always grows very fast in the first rounds.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: The framework of the proposed method. At the clients' sides, training takes place. The blue dashed box signifies the selection of the base tree, while the green dashed box represents the combination of the base tree with other trees, all located on the server side. Depending on the chosen strategy, the base tree may consist of one or multiple trees.</figDesc><graphic coords="5,171.54,40.19,164.48,301.10" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Algorithm 4 1 :</head><label>41</label><figDesc>Individual strategy (Ind) Input: List of clients' trees 𝑐𝑙𝑖𝑒𝑛𝑡𝑠𝑇 𝑟𝑒𝑒 = [𝐶 1 , 𝐶 2 , … , 𝐶 𝑘 ] 2: Output: List of individual aggregated trees for each client 3: 𝐼 𝑛𝑑𝑠.copy(𝑐𝑙𝑖𝑒𝑛𝑡𝑠𝑇 𝑟𝑒𝑒) 4: for each 𝐼 𝑛𝑑 in 𝐼 𝑛𝑑𝑠 do 5: for each 𝐶 𝑖 in 𝑐𝑙𝑖𝑒𝑛𝑡𝑠𝑇 𝑟𝑒𝑒 do 6: if 𝐶 𝑖 ≠ 𝐼 𝑛𝑑 then 7: aggregation(𝐼 𝑛𝑑, 𝐶 𝑖 ) ▷ Each tree is aggregated with the other trees' common path 8: end if 9: end for 10: end for</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Figure 2 :Figure 3 :</head><label>23</label><figDesc>Figure 2: The performance of Syn, Sem, and Ind strategies, along with local and centralized learning of experiments on the mushroom dataset. The x-axis represents the number of communication rounds between clients and servers, and the y-axis represents the accuracy.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>1 :</head><label>1</label><figDesc>function aggregation(𝑇 1 , 𝑇 2 )</figDesc><table><row><cell>2:</cell><cell>function traverse(𝑛𝑜𝑑𝑒1, 𝑛𝑜𝑑𝑒2)</cell></row><row><cell>3:</cell><cell>if 𝑛𝑜𝑑𝑒1 is None or 𝑛𝑜𝑑𝑒2 is None then</cell></row><row><cell>4:</cell><cell>exit</cell></row><row><cell>5:</cell><cell>else</cell></row><row><cell>6:</cell><cell>merge(𝑛𝑜𝑑𝑒1.𝑛 𝑖𝑗𝑘 , 𝑛𝑜𝑑𝑒2.𝑛 𝑖𝑗𝑘 )</cell></row><row><cell>7:</cell><cell>end if</cell></row><row><cell>8:</cell><cell>if 𝑛𝑜𝑑𝑒1.feature == 𝑛𝑜𝑑𝑒2.feature and 𝑛𝑜𝑑𝑒1.value == 𝑛𝑜𝑑𝑒2.value then</cell></row><row><cell>9:</cell><cell>traverse(𝑛𝑜𝑑𝑒1.right_child, 𝑛𝑜𝑑𝑒2.right_child)</cell></row><row><cell>10:</cell><cell>traverse(𝑛𝑜𝑑𝑒1.left_child, 𝑛𝑜𝑑𝑒2.left_child)</cell></row><row><cell>11:</cell><cell>else</cell></row><row><cell>12:</cell><cell>exit</cell></row><row><cell>13:</cell><cell>end if</cell></row><row><cell>14:</cell><cell>end function</cell></row><row><cell>15:</cell><cell>traverse(𝑇 1 .root, 𝑇 2 .root)</cell></row><row><cell>16:</cell><cell>end function</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Algorithms 2 and 3</head><label>2, 3</label><figDesc>Algorithm 2 Syntactic strategy (Syn) and Algorithm 3 Semantic strategy (Sem)</figDesc><table><row><cell></cell><cell>Algorithm 2 Syntactic strategy (Syn)</cell><cell>Algorithm 3 Semantic strategy (Sem)</cell></row><row><cell>1:</cell><cell>Input: List of clients' trees 𝑐𝑙𝑖𝑒𝑛𝑡𝑠𝑇𝑟𝑒𝑒 = [𝐶1, 𝐶2, …, 𝐶𝑘]</cell><cell>Input: List of clients' trees 𝑐𝑙𝑖𝑒𝑛𝑡𝑠𝑇𝑟𝑒𝑒 = [𝐶1, 𝐶2, …, 𝐶𝑘]</cell></row><row><cell>2:</cell><cell></cell><cell>dataset located on server: 𝐷𝑠</cell></row><row><cell>3:</cell><cell>Output: one global tree</cell><cell>Output: one global tree</cell></row><row><cell>4:</cell><cell>𝐵𝐹𝑠 ← ∅ ▷ To store bracket format of 𝑐𝑙𝑖𝑒𝑛𝑡𝑠𝑇𝑟𝑒𝑒</cell><cell>𝑃𝑟𝑒𝑑 ← ∅ ▷ To store 𝑐𝑙𝑖𝑒𝑛𝑡𝑠𝑇𝑟𝑒𝑒 predictions of 𝐷𝑠</cell></row><row><cell>5:</cell><cell>for each 𝐶𝑖 in 𝑐𝑙𝑖𝑒𝑛𝑡𝑠𝑇𝑟𝑒𝑒 do</cell><cell>for each 𝐶𝑖 in 𝑐𝑙𝑖𝑒𝑛𝑡𝑠𝑇𝑟𝑒𝑒 do</cell></row><row><cell>6:</cell><cell>𝐵𝐹𝑠.add(BracketFormat(𝐶𝑖))</cell><cell>𝑃𝑟𝑒𝑑.add(prediction(𝐶𝑖, 𝐷𝑠))</cell></row><row><cell>7:</cell><cell>end for</cell><cell>end for</cell></row><row><cell>8:</cell><cell>𝑆𝑦𝑛 ← 𝑆𝑒𝑙𝑒𝑐𝑡𝐵𝑎𝑠𝑒𝑇𝑟𝑒𝑒(𝐵𝐹𝑠) ▷ Representative syntactic tree</cell><cell>𝑆𝑒𝑚 ← 𝑆𝑒𝑙𝑒𝑐𝑡𝐵𝑎𝑠𝑒𝑇𝑟𝑒𝑒(𝑃𝑟𝑒𝑑) ▷ Representative semantic tree</cell></row><row><cell>9:</cell><cell>for each 𝐶𝑖 in 𝑐𝑙𝑖𝑒𝑛𝑡𝑠𝑇𝑟𝑒𝑒 do</cell><cell>for each 𝐶𝑖 in 𝑐𝑙𝑖𝑒𝑛𝑡𝑠𝑇𝑟𝑒𝑒 do</cell></row><row><cell>10:</cell><cell>if 𝐶𝑖 ≠ 𝑆𝑦𝑛 then</cell><cell>if 𝐶𝑖 ≠ 𝑆𝑒𝑚 then</cell></row><row><cell>11:</cell><cell>aggregation(𝑆𝑦𝑛, 𝐶𝑖)</cell><cell>aggregation(𝑆𝑒𝑚, 𝐶𝑖)</cell></row><row><cell>12:</cell><cell>end if</cell><cell>end if</cell></row><row><cell>13:</cell><cell>end for</cell><cell>end for</cell></row><row><cell>14:</cell><cell>return 𝑆𝑦𝑛</cell><cell>return 𝑆𝑒𝑚</cell></row></table></figure>
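The two strategies differ only in how 𝑆𝑒𝑙𝑒𝑐𝑡𝐵𝑎𝑠𝑒𝑇𝑟𝑒𝑒 picks the representative client tree: the syntactic strategy compares the trees' bracket-format strings, while the semantic strategy compares their prediction vectors on the server-side dataset 𝐷𝑠. One way to read the selection step is as a medoid choice: pick the tree whose total distance to all other clients' trees is smallest. The sketch below illustrates this for the semantic case using Hamming distance between prediction vectors; the distance measure and function names are assumptions for illustration, not the paper's exact procedure (the syntactic variant would apply the same medoid idea to bracket formats with a tree edit distance).

```python
def hamming(p, q):
    """Fraction of server-set examples on which two trees disagree."""
    return sum(a != b for a, b in zip(p, q)) / len(p)

def select_base_tree(preds):
    """Medoid selection for the Sem strategy: return the index of the
    client whose prediction vector has the smallest summed distance to
    all other clients' prediction vectors on the server dataset."""
    totals = [sum(hamming(p, q) for q in preds) for p in preds]
    return totals.index(min(totals))
```

For example, with three clients predicting `['e','p','e']`, `['e','p','p']`, and `['e','e','p']` on three server examples, the middle tree is closest to both others and would be chosen as the base tree.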
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 1</head><label>1</label><figDesc>Mean ± std of the accuracy (%) of 10 runs for different methods in some rounds (Mushroom dataset)</figDesc><table><row><cell cols="2">#round Centralized</cell><cell>Local</cell><cell>Ind</cell><cell>Syn</cell><cell>Sem</cell></row><row><cell>1</cell><cell cols="5">48.97 ± 9.13 44.44 ± 11.11 44.44 ± 11.11 44.44 ± 11.11 44.44 ± 11.11</cell></row><row><cell>5</cell><cell>70.15 ± 4.39</cell><cell>53.83 ± 2.54</cell><cell>64.29 ± 5.22</cell><cell>65.51 ± 6.25</cell><cell>64.23 ± 6.99</cell></row><row><cell>15</cell><cell>82.50 ± 1.81</cell><cell>66.31 ± 2.02</cell><cell>79.60 ± 1.75</cell><cell>79.90 ± 1.99</cell><cell>79.55 ± 2.37</cell></row><row><cell>25</cell><cell>86.38 ± 1.51</cell><cell>72.81 ± 1.63</cell><cell>83.53 ± 1.27</cell><cell>83.79 ± 1.20</cell><cell>83.45 ± 1.38</cell></row><row><cell>50</cell><cell>89.65 ± 1.05</cell><cell>80.58 ± 0.82</cell><cell>87.52 ± 0.84</cell><cell>87.74 ± 0.65</cell><cell>87.54 ± 0.87</cell></row><row><cell>100</cell><cell>92.92 ± 0.45</cell><cell>85.97 ± 0.48</cell><cell>90.66 ± 0.58</cell><cell>90.80 ± 0.52</cell><cell>90.64 ± 0.54</cell></row><row><cell>150</cell><cell>94.86 ± 0.48</cell><cell>88.27 ± 0.41</cell><cell>92.37 ± 0.54</cell><cell>92.44 ± 0.61</cell><cell>92.20 ± 0.54</cell></row><row><cell>175</cell><cell>95.51 ± 0.44</cell><cell>88.95 ± 0.46</cell><cell>92.96 ± 0.50</cell><cell>93.06 ± 0.56</cell><cell>92.80 ± 0.51</cell></row><row><cell>185</cell><cell>95.73 ± 0.42</cell><cell>89.22 ± 0.41</cell><cell>93.20 ± 0.54</cell><cell>93.31 ± 0.60</cell><cell>93.06 ± 0.54</cell></row><row><cell>195</cell><cell>95.92 ± 0.41</cell><cell>89.50 ± 0.39</cell><cell>93.42 ± 0.53</cell><cell>93.55 ± 0.58</cell><cell>93.30 ± 0.55</cell></row><row><cell>200</cell><cell>96.01 ± 0.40</cell><cell>89.62 ± 0.38</cell><cell>93.52 ± 0.54</cell><cell>93.66 ± 0.59</cell><cell>93.42 ± 0.56</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_3"><head>Table 2</head><label>2</label><figDesc>Mean ± std of the accuracy (%) of 10 runs for different methods in some rounds (Chess dataset)</figDesc><table><row><cell cols="2">#round Centralized</cell><cell>Local</cell><cell>Ind</cell><cell>Syn</cell><cell>Sem</cell></row><row><cell>1</cell><cell cols="5">48.18 ± 14.11 45.00 ± 18.71 45.00 ± 18.71 45.00 ± 18.71 45.00 ± 18.71</cell></row><row><cell>5</cell><cell>57.12 ± 4.22</cell><cell>45.54 ± 6.83</cell><cell>54.64 ± 4.87</cell><cell>56.25 ± 6.20</cell><cell>53.57 ± 4.72</cell></row><row><cell>15</cell><cell>63.30 ± 2.62</cell><cell>51.59 ± 4.33</cell><cell>61.08 ± 2.99</cell><cell>61.88 ± 3.51</cell><cell>59.55 ± 3.93</cell></row><row><cell>25</cell><cell>66.96 ± 2.44</cell><cell>56.66 ± 2.56</cell><cell>64.56 ± 3.23</cell><cell>65.74 ± 3.02</cell><cell>63.18 ± 4.06</cell></row><row><cell>50</cell><cell>69.30 ± 1.92</cell><cell>61.12 ± 1.23</cell><cell>66.33 ± 2.05</cell><cell>66.68 ± 2.13</cell><cell>65.72 ± 2.59</cell></row><row><cell>100</cell><cell>77.11 ± 2.31</cell><cell>65.49 ± 1.22</cell><cell>72.63 ± 2.55</cell><cell>72.92 ± 2.78</cell><cell>72.27 ± 2.92</cell></row><row><cell>150</cell><cell>81.83 ± 2.01</cell><cell>66.99 ± 1.38</cell><cell>77.34 ± 2.95</cell><cell>77.80 ± 3.03</cell><cell>77.31 ± 3.06</cell></row><row><cell>175</cell><cell>83.46 ± 1.87</cell><cell>67.54 ± 1.08</cell><cell>78.96 ± 3.00</cell><cell>79.49 ± 2.90</cell><cell>79.07 ± 2.91</cell></row><row><cell>185</cell><cell>83.97 ± 1.80</cell><cell>67.82 ± 1.01</cell><cell>79.39 ± 2.93</cell><cell>79.88 ± 2.88</cell><cell>79.49 ± 2.79</cell></row><row><cell>195</cell><cell>84.49 ± 1.72</cell><cell>68.14 ± 1.01</cell><cell>79.94 ± 2.87</cell><cell>80.51 ± 2.80</cell><cell>80.07 ± 2.68</cell></row><row><cell>200</cell><cell>84.65 ± 1.62</cell><cell>68.30 ± 0.94</cell><cell>80.14 ± 2.78</cell><cell>80.71 ± 2.71</cell><cell>80.27 ± 2.58</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">https://archive.ics.uci.edu/dataset/73/mushroom</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_1">https://archive.ics.uci.edu/dataset/22/chess+king+rook+vs+king+pawn</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgments</head><p>The work was carried out with support from The Knowledge Foundation and from Vinnova (Sweden's innovation agency) through the Vehicle Strategic Research and Innovation Programme, FFI.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead</title>
		<author>
			<persName><forename type="first">C</forename><surname>Rudin</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Nature Machine Intelligence</title>
		<imprint>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="206" to="215" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<monogr>
		<title level="m" type="main">The (un)reliability of saliency methods, Explainable AI: Interpreting, explaining and visualizing deep learning</title>
		<author>
			<persName><forename type="first">P.-J</forename><surname>Kindermans</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Hooker</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Adebayo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Alber</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">T</forename><surname>Schütt</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Dähne</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Erhan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Kim</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="267" to="280" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<monogr>
		<author>
			<persName><forename type="first">D</forename><surname>Alvarez-Melis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">S</forename><surname>Jaakkola</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1806.08049</idno>
		<title level="m">On the robustness of interpretability methods</title>
				<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Fooling LIME and SHAP: Adversarial attacks on post hoc explanation methods</title>
		<author>
			<persName><forename type="first">D</forename><surname>Slack</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Hilgard</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Jia</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Singh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Lakkaraju</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society</title>
				<meeting>the AAAI/ACM Conference on AI, Ethics, and Society</meeting>
		<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="180" to="186" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<monogr>
		<author>
			<persName><forename type="first">G</forename><surname>Wang</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1905.04519</idno>
		<title level="m">Interpret federated learning with Shapley values</title>
				<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">EVFL: An explainable vertical federated learning for data-oriented artificial intelligence systems</title>
		<author>
			<persName><forename type="first">P</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Du</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Lu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">C</forename><surname>Hung</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Systems Architecture</title>
		<imprint>
			<biblScope unit="volume">126</biblScope>
			<biblScope unit="page">102474</biblScope>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Mind the data, measuring the performance gap between tree ensembles and deep learning on tabular data</title>
		<author>
			<persName><forename type="first">A</forename><surname>Karlsson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Nowaczyk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Pashami</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Asadi</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Symposium on Intelligent Data Analysis</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2024">2024</date>
			<biblScope unit="page" from="65" to="76" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Why do tree-based models still outperform deep learning on typical tabular data?</title>
		<author>
			<persName><forename type="first">L</forename><surname>Grinsztajn</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Oyallon</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Varoquaux</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Advances in Neural Information Processing Systems</title>
		<imprint>
			<biblScope unit="volume">35</biblScope>
			<biblScope unit="page" from="507" to="520" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Decision tree-based federated learning: A survey</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Gai</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Blockchains</title>
		<imprint>
			<biblScope unit="volume">2</biblScope>
			<biblScope unit="page" from="40" to="60" />
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">SecureBoost: A lossless federated learning framework</title>
		<author>
			<persName><forename type="first">K</forename><surname>Cheng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Fan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Jin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Papadopoulos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Yang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Intelligent Systems</title>
		<imprint>
			<biblScope unit="volume">36</biblScope>
			<biblScope unit="page" from="87" to="98" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<monogr>
		<author>
			<persName><forename type="first">X</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Hu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Feng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Peng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Hong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Ren</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Qin</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2210.01318</idno>
		<title level="m">OpBoost: a vertical federated tree boosting framework based on order-preserving desensitization</title>
				<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Boosting privately: Federated extreme gradient boosting for mobile crowdsensing</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Ma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Ma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Nepal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">H</forename><surname>Deng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Ren</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">2020 IEEE 40th International Conference on Distributed Computing Systems (ICDCS)</title>
				<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="1" to="11" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">VF2Boost: Very fast vertical federated gradient boosting for cross-enterprise learning</title>
		<author>
			<persName><forename type="first">F</forename><surname>Fu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Shao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Yu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Jiang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Xue</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Tao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Cui</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2021 International Conference on Management of Data</title>
				<meeting>the 2021 International Conference on Management of Data</meeting>
		<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="563" to="576" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Inprivate digging: Enabling tree-based distributed data mining with differential privacy</title>
		<author>
			<persName><forename type="first">L</forename><surname>Zhao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Ni</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Hu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Zhou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Xiao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Wu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE INFOCOM 2018-IEEE Conference on Computer Communications</title>
				<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="2087" to="2095" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">BOFRF: A novel boosting-based federated random forest algorithm on horizontally partitioned data</title>
		<author>
			<persName><forename type="first">M</forename><surname>Gencturk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">A</forename><surname>Sinaci</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><forename type="middle">K</forename><surname>Cicekli</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Access</title>
		<imprint>
			<biblScope unit="volume">10</biblScope>
			<biblScope unit="page" from="89835" to="89851" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Lightweight privacy-preserving federated incremental decision trees</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Han</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Ge</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Liu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Services Computing</title>
		<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Horizontal federating decision tree learning from data streams: Building intelligence in IoT edge networks</title>
		<author>
			<persName><forename type="first">S</forename><surname>Sharma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Arora</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">S</forename><surname>Thakur</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">2022 IEEE 8th World Forum on Internet of Things (WF-IoT)</title>
				<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="1" to="6" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">Mining high-speed data streams</title>
		<author>
			<persName><forename type="first">P</forename><surname>Domingos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Hulten</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining</title>
				<meeting>the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining</meeting>
		<imprint>
			<date type="published" when="2000">2000</date>
			<biblScope unit="page" from="71" to="80" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Extremely fast decision tree</title>
		<author>
			<persName><forename type="first">C</forename><surname>Manapragada</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">I</forename><surname>Webb</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Salehi</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery &amp; Data Mining</title>
				<meeting>the 24th ACM SIGKDD International Conference on Knowledge Discovery &amp; Data Mining</meeting>
		<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="1953" to="1962" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">Efficient computation of the tree edit distance</title>
		<author>
			<persName><forename type="first">M</forename><surname>Pawlik</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Augsten</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ACM Transactions on Database Systems (TODS)</title>
		<imprint>
			<biblScope unit="volume">40</biblScope>
			<biblScope unit="page" from="1" to="40" />
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">Tree edit distance: Robust and memory-efficient</title>
		<author>
			<persName><forename type="first">M</forename><surname>Pawlik</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Augsten</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Information Systems</title>
		<imprint>
			<biblScope unit="volume">56</biblScope>
			<biblScope unit="page" from="157" to="173" />
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">Selecting a representative decision tree from an ensemble of decision-tree models for fast big data classification</title>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">I</forename><surname>Weinberg</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Last</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Big Data</title>
		<imprint>
			<biblScope unit="volume">6</biblScope>
			<biblScope unit="page" from="1" to="17" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<analytic>
		<title level="a" type="main">The comparison between classification trees through proximity measures</title>
		<author>
			<persName><forename type="first">R</forename><surname>Miglio</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Soffritti</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Computational Statistics &amp; Data Analysis</title>
		<imprint>
			<biblScope unit="volume">45</biblScope>
			<biblScope unit="page" from="577" to="593" />
			<date type="published" when="2004">2004</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<monogr>
		<author>
			<persName><forename type="first">J</forename><surname>Montiel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Halford</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">M</forename><surname>Mastelini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Bolmier</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Sourty</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Vaysse</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Zouitine</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><forename type="middle">M</forename><surname>Gomes</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Read</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Abdessalem</surname></persName>
		</author>
		<title level="m">River: machine learning for streaming data in Python</title>
				<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
