A Simple Heuristic for the Graph Tukey Depth Problem with Potential Applications to Graph Mining Florian Seiffarth1 , Tamás Horváth1,2,3 and Stefan Wrobel1,2,3 1 Dept. of Computer Science, University of Bonn, Bonn, Germany 2 Fraunhofer IAIS, Schloss Birlinghoven, Sankt Augustin, Germany 3 Fraunhofer Center for Machine Learning, Sankt Augustin, Germany Abstract We study a recently introduced adaptation of Tukey depth to graphs and discuss its algorithmic properties and potential applications to mining and learning with graphs. In particular, since it is NP-hard to compute the Tukey depth of a node, as a first contribution we provide a simple heuristic based on maximal closed set separation in graphs and show empirically on different graph datasets that its approximation error is small. Our second contribution is concerned with geodesic core-periphery decompositions of graphs. We show empirically that the geodesic core of a graph consists of those nodes that have a high Tukey depth. This information allows for a parameterized deterministic definition of the geodesic core of a graph. Keywords graph Tukey depth, geodesic closure, closed set separations, geodesic core-periphery decomposition 1. Introduction Centrality measures are of high importance in data analysis, as they typically capture the elements’ “importance” quantitatively. Of course, the meaning of importance depends on the choice of the particular centrality measure. Different types of centrality measures have been introduced for networks (see, e.g., [1]), including degree centrality, eigenvector centrality, Katz centrality, closeness centrality, betweenness centrality, page rank, and hubs and authorities. In Fig. 1 we present a graphical illustration of some of these centrality measures for some small graphs for a visual comparison. Graph Tukey depth is a relatively new centrality measure, introduced in [3]. It is based on the original notion of Tukey depth that is defined over finite subsets of R𝑑 [4, 5]. Informally speaking, the semantics of Tukey depth is as follows: An element 𝑒 has high Tukey depth if it is “hard” to separate it from the rest of the finite ground set using separating hyper-planes only. Conversely, 𝑒 has low Tukey depth if it is “easy” to separate it from the rest of the set. “Hard” resp. “easy” in this context means that the half-space bounded by the separating hyper-plane that contains 𝑒 contains many resp. a few other elements of the finite ground set. Thus, the elements’ “importance” defined by ordinary Tukey depth in R𝑑 relies on the possibility of separating them from other elements. This property directly connects the Tukey depth to all LWDA’22: Lernen, Wissen, Daten, Analysen. October 05–07, 2022, Hildesheim, Germany $ seiffarth@cs.uni-bonn.de (F. Seiffarth); horvath@cs.uni-bonn.de (T. Horváth); wrobel@cs.uni-bonn.de (S. Wrobel) © 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings http://ceur-ws.org ISSN 1613-0073 CEUR Workshop Proceedings (CEUR-WS.org) Degree Cen. Closeness Cen. Betweenness Cen. Tukey Depth MUTAG NCI1 MSRC_9 COIL-DEL small large Figure 1: The Degree Centrality, Closeness Centrality, Betweenness Centrality and Tukey Depth of nodes in graphs selected from different graph datasets [2]. The centrality (resp. depth) values are normalized (i.e, mapped to the interval [0, 1]) by their maximum values in the graph. In particular, nodes of the smallest (resp. highest) centrality values are denoted by yellow (resp. blue). machine learning approaches that are based on hyper-plane separations [6, 7]. For R𝑑 and in other more general metric spaces [8], Tukey depth has been studied in the context of machine learning, in particular, for object classification [9]. In case of learning linear classifiers, it is shown in [10] that the Tukey median, i.e., the points with the highest Tukey depth, are related to the Bayes point. An important advantage of Tukey depth over other centrality measures is that the exact geometric position of the points in R𝑑 is not relevant for their depth. Thus, the Tukey depth is relatively stable regarding outliers and is therefore used for outlier detection [11, 12]. Moreover, as Tukey depth relies on separations, it is possible to adapt it to other domains enabling (abstract) hyper-plane or half-space separation. In particular, this property of the original Tukey depth is utilized in its adaptation to geodesic closure systems over graphs [3]. Similarly to the fact that the computation of the Tukey depth in R𝑑 is NP-hard [13], it is also NP-hard to compute the graph Tukey depth of a node [3]. Motivated by this negative result, one of our main contributions is a heuristic algorithm for approximating graph Tukey depth. It runs in time polynomial in the size of the input graph and approximates the Tukey depth of a node with one-sided error by overestimating it. Our experimental results with small graphs clearly demonstrate that the approximation is close to the exact Tukey depth, by noting that for larger graphs we were not able to evaluate the approximation performance of our algorithm, as it was not possible to calculate the exact Tukey depth in practically feasible time. Our heuristic is based on our greedy algorithm designed in [14] for solving the more general maximal closed set separation (MCSS) problem. In the particular case of graphs, the problem is as follows: Given two subsets 𝐴, 𝐵 of the node set, find, if they exist, two (inclusion) maximal disjoint geodesically closed node sets that separate 𝐴 and 𝐵 from each other. As maximal disjoint closed sets provide a separation of sets, it is, therefore, natural to ask the following question: Is there a connection between graph Tukey depth and node separation with geodesically closed node sets? We give an affirmative answer to this question by showing experimentally on small graph datasets that the solutions of the MCSS problem provide a good approximation quality of the exact Tukey depth. This mainly relies on the following fact: For any closed set containing at least one node of high Tukey depth, there exists no large disjoint closed set. It follows from the definition of graph Tukey depth that it is related to other concepts based on geodesic convexity. One of these notions is the recent probabilistic definition of geodesic core-periphery decomposition of graphs. It was introduced in [15] and studied in [15, 16, 17, 18]. Hence, our second question is concerned with the following problem: Is there a connection between graph Tukey depth and geodesic core-periphery decompositions? The geodesic core- periphery decomposition breaks up some types of graphs (including interaction graphs, such as social networks) into a dense core and a sparse “surrounding” periphery [15] (see Fig. 3 for a visual example). While some graphs (e.g., Erdős-Rényi, Barabási-Albert, and Watts-Strogatz random graphs) seem to have no periphery, others (e.g., trees and fully connected graphs) seem to have no core. Since this behavior is not well-understood up to now, we try to understand it by studying the nodes’ Tukey depth. It seems that the geodesic core of a graph consists of those nodes that are of high Tukey depth (see Fig. 4 for some examples). This observation allows for a parameterized deterministic definition of the cores. That is, the core of a graph can be defined by the set of those nodes that have a Tukey depth greater than a user-specified threshold. Our empirical results clearly demonstrate that using the right threshold, the probabilistic definition of cores in [15] coincides with our deterministic one. The rest of the paper is organized as follows. In Sec. 2 we first collect the necessary notions and notation. Sec. 3 contains some examples showing that graph Tukey depth is strongly related to such mining and learning algorithms on graphs that rely on graph geodesic convexity. In Sec. 4 we present our heuristic for approximating graph Tukey depth and evaluate it empirically on small graph datasets. Finally, in Sec. 5 we mention some open questions for future research. 2. Preliminaries In this section, we collect the necessary notions and fix the notation. For a graph 𝐺 = (𝑉, 𝐸), 𝑉 (𝐺) and 𝐸(𝐺) denote the set 𝑉 of nodes and the set 𝐸 of edges. Furthermore 𝑛 and 𝑚 denotes |𝑉 (𝐺)| and |𝐸(𝐺)|, respectively. By graphs, we always mean undirected graphs without loops and multi-edges. For any 𝑢, 𝑣 ∈ 𝑉 (𝐺), the (geodesic) interval [𝑢, 𝑣] is the set of all nodes on all shortest paths between 𝑢 and 𝑣 (see Fig. 2a for an example). A set of nodes 𝑋 ⊆ 𝑉 (𝐺) is called (geodesically) closed iff for all 𝑢, 𝑣 ∈ 𝑉 (𝐺), 𝑢, 𝑣 ∈ 𝑋 implies [𝑢, 𝑣] ⊆ 𝑋. The closure 𝜌(𝑋) of a 𝑢 𝑣 𝑢 𝑣 (a) (b) Figure 2: This figure shows (a) the geodesic interval [𝑢, 𝑣], i.e., the nodes on all shortest paths between 𝑢, 𝑣 (blue) and (b) the geodesic closure 𝜌({𝑢, 𝑣}), i.e., the smallest set of nodes that contains 𝑢, 𝑣 and all nodes on all shortest paths between arbitrary node pairs of the set (red). set 𝑋 ⊆ 𝑉 (𝐺) is the smallest closed set containing 𝑋 (see Fig. 2b for an example of 𝜌({𝑢, 𝑣})). The Graph Tukey Depth Problem For a graph 𝐺, the Tukey depth of a node 𝑣 ∈ 𝑉 (𝐺) is defined as follows [3]: Let 𝐶 ⊆ 𝑉 (𝐺) be a closed set of maximum cardinality such that 𝑣 ∈ / 𝐶. Then the Tukey depth of 𝑣 is defined by td(𝑣) := |𝑉 (𝐺)| − |𝐶|. That is, 𝑣 has a high Tukey depth if all closed sets not containing 𝑣 are of small cardinality. We are interested in the Graph Tukey depths problem defined as follows: Given a graph 𝐺 compute td(𝑣) for all 𝑣 ∈ 𝑉 (𝐺). Geodesic Cores Geodesic cores have been defined in [15] in a probabilistic way. Informally, the geodesic core of a graph consists of those nodes which are contained in most geodesic closed sets that are generated by a small number of random nodes. Of course, the core defined in this way can be empty. However, it turns out that this is not the case for interaction graphs (e.g. social networks) [15]. We define the geodesic core of a graph 𝐺, denoted by 𝒞 as follows. Let 𝑋1 , 𝑋2 , . . . be a sequence of sets, where each set consists ⋂︀𝑖 of 𝑘 > 0 nodes selected independently and uniformly at random from 𝑉 (𝐺). Then 𝒞 = 𝑗=1 𝜌(𝑋𝑗 ) is the core of 𝐺, where 𝑖 is the smallest integer satisfying 𝑖𝑗=1 𝜌(𝑋𝑗 ) = 𝑖+1 𝑗=1 𝜌(𝑋𝑗 ). Obviously, this definition is not ⋂︀ ⋂︀ deterministic since different choices of 𝑋𝑗 and 𝑘 can lead to different cores. Nevertheless, the experiments in [16] with large real-world networks show that for 𝑘 ≈ 10, the core (if it exists) does not depend on the particular choice of the generator elements. The core-periphery decomposition of a graph is composed of the subgraph induced by the core nodes and that by the remaining nodes, called periphery. In Fig. 3 we give a visual example of the core-periphery decomposition of the CA-GrQc network [19]1 . MCSS problem for Graphs We will use the maximal closed set separation (MCSS) problem restricted to geodesic closures over graphs (see [14] for the generic definition): Given a graph 𝐺 = (𝑉, 𝐸) and 𝐴, 𝐵 ⊆ 𝑉 (𝐺) with disjoint geodesic closures, find two geodesic closed sets 𝐴′ , 𝐵 ′ ⊆ 𝑉 (𝐺) with 𝐴′ ∩ 𝐵 ′ = ∅ and 𝐴 ⊆ 𝐴′ , 𝐵 ⊆ 𝐵 ′ that are maximal, i.e., there exist no proper supersets of 𝐴′ resp. 𝐵 ′ that fulfill the above properties. 1 This network is built by the co-authorships in the general relativity and quantum cosmology community. (a) Entire Network (b) (Geodesic) Core (c) Periphery Figure 3: Example of a geodesic core-periphery decomposition (c.f. [16]) (a) the CA-GrQc network [19] with core in orange and periphery in blue, (b) its (geodesic) core, (c) its periphery. 3. Potential Applications to Mining and Learning with Graphs Before we turn to the algorithmic aspects of the graph Tukey depth problem, in this section we first present some of its potential applications to mining and learning with graphs. In particular, this section deals with some connections of graph Tukey depth to node separations and to geodesic core-periphery decompositions. We state three important properties of Tukey depth. In particular, Proposition 1 clarifies the role of Tukey depth in the context of geodesic closed sets. Proposition 1. Let 𝐺 be a graph, 𝑣 ∈ 𝑉 (𝐺) with td(𝑣) = 𝑛 − 𝑐, and 𝐶 ⊆ 𝑉 (𝐺) a geodesically closed node set with |𝐶| > 𝑐. Then 𝑣 ∈ 𝐶. Proposition 2 ([3]). Let 𝐺 be a graph, 𝑋 ⊆ 𝑉 (𝐺), and 𝐶 be the closure of 𝑋. Then the graph Tukey depth is a quasi-concave function, i.e., for all 𝑐 ∈ 𝐶 we have 𝑡𝑑(𝑐) ≥ min{𝑡𝑑(𝑥) : 𝑥 ∈ 𝑋} . Proposition 3 ([3]). Let 𝐺 be a graph, 𝑘 ∈ N, and 𝑋 = {𝑣 ∈ 𝑉 (𝐺) : td(𝑣) ≥ 𝑘}. Then 𝑋 is geodesically closed. To underline the importance of these three statements, we give two examples that show how they (can) influence machine learning and data mining methods based on geodesic closures. Example 1: Node Classification and Active Learning In [14, 20, 21, 22], disjoint half- spaces and closed sets are used for binary classification in closure systems, for node classification, and active learning in graphs using geodesic convexity. Given the Tukey depth td(𝑣) of a node 𝑣, Proposition 1 immediately implies that a separating half-space or closed set not containing 𝑣 cannot have a cardinality greater than 𝑛 − td(𝑣). Thus, for nodes of high Tukey depth, there is no large geodesic closed set not containing them. Hence, Proposition 1 implies a nice theoretical connection between graph Tukey depth and the maximum size of separating half-spaces and closed sets. Using approximate graph Tukey depths, the predictive performance of all the above methods can possibly be improved. Karate Club Les Miserables Dolphins Tukey Depth small depths large depths Geod. Core Figure 4: Tukey depth (top) vs. geodesic core-periphery decomposition (bottom) for the Karate Club [23], Les Miserables character [24], and Dolphins social networks [25]. For the different Tukey depths we use sequential colors. Core and periphery nodes are denoted by blue and yellow, respectively. Example 2: Geodesic core-periphery decomposition The geodesic core-periphery decom- position of graphs was analyzed in [15, 16, 17]. In particular, it was found in [15, 17] that many social networks consist of a dense geodesic core “surrounded” by a sparse periphery (see Fig. 3 for an example). While some graphs, especially tree-like graphs, seem to have no core, others, such as graphs sampled from random graph models (e.g., Erdős-Rényi, Barabási-Albert and Watts-Strogatz) seem to have no periphery. Moreover, the closure of a small number of randomly chosen graph nodes (≈ 10) always contains the geodesic core (if it exists). Furthermore, if the nodes are sampled from the geodesic core only, then the closure of the nodes is the geodesic core itself. If we compute the closure of, say, 10 randomly chosen nodes from the entire network (Fig. 3a), then the closure always contains the core (orange nodes in Fig. 3b). If all random nodes belong to the core (orange nodes in Fig. 3a), then their closure is the core itself. The above statements explain this behavior. Using that the core is always contained in the closure of a small number of randomly chosen nodes, from Proposition 2 it follows that the nodes in the core are those with the highest Tukey depths. Moreover, the quasi-concave property of graph Tukey depth implies that if the core is generated by a few nodes from the core, then the core nodes must have a very close Tukey depth. Finally, using Proposition 3, we have that the set of nodes in a graph with a Tukey depth above some threshold is always geodesically closed; cores arise as a special case of this property. These three properties motivate the following deterministic definition of geodesic cores: Definition 1. The 𝑘-geodesic core of a graph 𝐺 is defined by 𝐶𝑘 := {𝑣 ∈ 𝑉 (𝐺) : td(𝑣) ≥ 𝑘} . To empirically confirm our claim that the core contains the nodes with the highest Tukey depths, consider the three graphs in Fig. 4. We computed the exact Tukey depths (top) and their geodesic cores (bottom). For the Karate Club network (left) considered also in [3], the core exactly matches the set of nodes of the highest Tukey depth. Furthermore, there is not much fluctuation in the depths of the core nodes. In fact, all nodes of Tukey depth of at most 3 belong to the periphery and all nodes of Tukey depth 19 or 21 to the core, by noting that there are no nodes of Tukey depth between 4 and 18. In case of the Les Miserables character network (middle), there is only a single node with a very high Tukey depth of 57, surrounded by nodes of depth less than 35. In this case, the core algorithm returns only the node with the highest Tukey depth, showing that the graph Tukey depth can possibly be used to improve core-periphery decompositions. This is also the case for the Dolphin community graph (right), where the core consists of nodes with Tukey depth greater than 2, while all nodes in the periphery have a Tukey depth of at most 2. 4. Approximating the Tukey Depth Motivated by the negative complexity result concerning the calculation of Tukey depth, in Sec. 4.1 below we first propose a heuristic based on the algorithm in [14] that solves the MCSS problem. It approximates the nodes’ Tukey depths with one-sided error. We show experimentally on different types of small graphs that the results obtained by our heuristic are fairly close to the exact ones. Furthermore, even on small graphs, our algorithm is up to 200 times faster than the exact one (Sec. 4.2). It is important to emphasize that we had to resort to such graphs, as it was not possible to calculate the exact Tukey depths for larger graphs in a practically feasible time. 4.1. The Heuristic Recall that the exact Tukey depth of a node 𝑣 is defined by td(𝑣) := |𝑉 (𝐺)| − |𝐶|, where |𝐶| is the maximum cardinality of a closed set 𝐶 not containing 𝑣. It can be computed exactly using an integer linear program (see [3] for the details). The computationally hard part of the problem is to find a closed set of maximum size. Our heuristic addresses this problem by considering an inclusion maximal closed set only, instead of a maximum cardinality closed set. This relaxation, which distorts of course the exact value of Tukey depth, allows us to apply the efficient greedy algorithm proposed in [14] for solving the MCSS problem. In what follows, for any 𝑣 ∈ 𝑉 (𝐺), td(𝑣) ̂︀ denotes the approximation of td(𝑣) obtained with our heuristic. Given a graph 𝐺, the rough idea to approximate the Tukey depth of a node 𝑣 ∈ 𝑉 (𝐺) is to find an inclusion maximal geodesically closed set 𝐶 ⊆ 𝑉 (𝐺) with 𝑣 ∈ / 𝐶. Such a set 𝐶 can be found by applying the MCSS algorithm [14] with input {𝑣} and {𝑣 ′ }. Then the output of the algorithm is a solution of the MCSS problem, i.e., it consists of two closed node sets 𝐻𝑣 , 𝐻𝑣′ ⊆ 𝑉 (𝐺) with 𝑣 ∈ 𝐻𝑣 and 𝑣 ′ ∈ 𝐻𝑣′ that are disjoint and inclusion maximal. That is, there exist no proper closed supersets of 𝐻𝑣 , 𝐻𝑣′ with the same properties. The Tukey depth can then be approximated using the cardinalities of 𝐻𝑣 resp. 𝐻𝑣′ . For a fixed node 𝑣, the result depends on the particular choice of 𝑣 ′ . To improve the approximation quality, we therefore call the MCSS algorithm for each node 𝑣 several times, with different nodes 𝑣 ′ ̸= 𝑣. The pseudo-code of the above heuristic is given in Alg. 1. In Line 1 we initialize the Tukey depth of all nodes in 𝐺 by setting them to the maximum possible value, i.e., to |𝑉 (𝐺)|. We repeat the procedure described above for all nodes 𝑣 ∈ 𝑉 (𝐺) and all their neighbors 𝑣 ′ ∈ Γ(𝑣) (see the Algorithm 1: Approximation of Graph Tukey Depth Input : graph 𝐺 Output : approximation td(𝑣) ̂︀ of td(𝑣) for all 𝑣 ∈ 𝑉 (𝐺) 1 td(𝑣) ̂︀ ←− |𝑉 (𝐺)| for all 𝑣 ∈ 𝑉 (𝐺); 2 forall 𝑣 ∈ 𝑉 (𝐺) do 3 forall 𝑣 ′ ∈ Γ(𝑣) do 4 𝐻𝑣′ , 𝐻𝑣 = 𝑀 𝐶𝑆𝑆({𝑣 ′ }, {𝑣}); 5 forall 𝑥 ∈ 𝑉 (𝐺) do 6 if 𝑥 ̸∈ 𝐻𝑣′ then 7 td(𝑥) ̂︀ = min{td(𝑥), ̂︀ |𝑉 (𝐺)| − |𝐻𝑣′ |}; 8 if 𝑥 ̸∈ 𝐻𝑣 then 9 td(𝑥) ̂︀ = min{td(𝑥), ̂︀ |𝑉 (𝐺)| − |𝐻𝑣 |}; 10 return td(𝑣) for all 𝑣 ∈ 𝑉 (𝐺) ̂︀ for-loops in Line 2 and 3). In this way we solve the MCSS problem for all input sets {𝑣}, {𝑣 ′ }, i.e., separate 𝑣 from all of its neighbors 𝑣 ′ by maximal disjoint closed sets 𝐻𝑣 , 𝐻𝑣′ (see Line 4). Note that the Tukey depth of a node 𝑥 is based on a closed set of maximum cardinality not containing 𝑥. Thus, if 𝑥 does not lie in the set 𝐻𝑣 (resp. 𝐻𝑣′ ), then the cardinality of 𝐻𝑣 (resp. 𝐻𝑣′ ) is smaller than or equal to a closed set of maximum cardinality not containing 𝑥. Hence, we can update the current Tukey depth approximation of all nodes 𝑥 ∈ 𝑉 (𝐺) as follows: Take the minimum over the old and the new approximation which is the cardinality of 𝑉 (𝐺) ∖ 𝐻𝑣 if 𝑥∈ / 𝐻𝑣 or that of 𝑉 (𝐺) ∖ 𝐻𝑣′ if 𝑥 ∈ / 𝐻𝑣′ (see Line 7 and Line 9). By construction, Alg. 1 finds only maximal and not maximum closed sets, resulting in a one-sided error in the estimate of Tukey depths. This is formulated in the proposition below. Proposition 4. Alg. 1 overestimates the Tukey depth, i.e., for the output td(𝑣) ̂︀ returned by Alg. 1 we have td(𝑣) ≥ td(𝑣), for all 𝑣 ∈ 𝑉 (𝐺). ̂︀ Regarding the runtime of Alg. 1, note that the inner part of the loop in Lines 3-9 is executed 𝑂 (𝑚) times because we iterate over all neighbors (i.e., all edges are considered twice). The runtime of the inner loop (Lines 3–9) is dominated by the MCSS algorithm called in Line 4, which calls the closure operator at most 𝑂 (𝑛) times [14]. Since the geodesic closure can be computed in time 𝒪(𝑚𝑛) [26], we have the following result for the total runtime of Alg. 1: Proposition 5. Alg. 1 returns an upper bound of the Tukey depth for all nodes of 𝐺 in 𝒪(𝑚2 𝑛2 ) time. The runtime of the approximation algorithm can be improved by considering for each node 𝑣 a fixed number of distinct nodes 𝑣 ′ , or by considering a fixed subset 𝑊 ⊆ 𝑉 (𝐺), instead of the whole node set 𝑉 (𝐺) in the outer loop (see Lines 2–9). It is left to further research to analyze how these changes affect the quality of the approximation performance. 4.2. Experimental Evaluation In this section we empirically evaluate the approximation quality and runtime of Alg. 1 on datasets containing small graphs2 . Regarding the approximation quality, we compare the results obtained by our heuristic algorithm to the exact Tukey depths computed with the algorithm in [3]. For the evaluation, we consider 19 graph datasets of different types (small molecules, small graphs from bioinformatics and computer vision, and small social networks) from [2] (see columns 2–4 of Tab. 1 for the number of graphs and their average number of nodes and edges). The average size of the graphs ranges from 14 (PTC_MM) up to 82 (OHSU ); their average edge numbers from 14 to 200. The reason for considering small graphs only is that the exact algorithm [3] was unable to calculate the Tukey depth for larger graphs within one day (see the last two columns of Tab. 1). For practical reasons, we removed all disconnected graphs from the original datasets, by noting that our heuristic works for disconnected graphs as well. The results are presented in Tab. 1. It contains the approximation qualities measured in different ways (columns 5–10) and the runtime of the exact (column 11) and our heuristic algorithm (column 12). The datasets are sorted according to their absolute approximation error (column 5 of Tab. 1), i.e., the sum of all differences between the approximation and the exact Tukey depth over all nodes and all graphs in the dataset. Regarding the absolute error (column 5), our approximation results are equal to the exact Tukey depths for 5 out of the 19 datasets, while their computation was faster by a factor of up to 100 (see PTC_MM). Our algorithm has the largest absolute error of 4155 on the COIL-DEL graphs, by noting that this dataset consists of 3900 graphs. Hence, the average error per graph is only slightly above one. Additionally, we look at the relative errors (column 6), i.e., the absolute error divided by the sum of all depths. We use this measure to validate that our algorithm performs very well, by noting that the relative errors are below 4 · 10−3 for all graph datasets. The per node error (column 7) is the average error our algorithm makes per node, while the per graph error (column 8) is the error it has on average per graph. Regarding the per node error, the worst-case is for the COIL-DEL dataset (last row) with an average error of 0.05. For the per graph error, the worst result was obtained for the OHSU dataset, where we overestimate the sum of all node depths by 1.65 per graph on average. This shows that our approximation algorithm performs very well, especially, if considering the average results over the datasets. Finally, we studied also the worst-case approximations for nodes and graphs. In particular, the columns Max. Node Error resp. Max. Graph Error denote the maximum error of the algorithm on single nodes resp. on single graphs. The results show that there is a very low error of at most 3 per node for 13 out of the 19 datasets. For three graph datasets, the maximum error per node is at most 7 and we have a maximum error between 11 and 19 in three cases. Regarding the maximum error per graph, a similar behavior can be observed by noting that except for OHSU and Peking_1, the maximum node errors and maximum graph errors are close to each other. This implies that there are only a few nodes with a high approximation error. It is an interesting problem to find the structural properties of such nodes and graphs that are responsible for the high approximation errors. The last two columns show the runtimes of the two algorithms. Our algorithm (last column) is faster than the exact one by at least one order of magnitude. In summary, the evaluation of Alg. 1 clearly demonstrates that our heuristic performs well in 2 See https://github.com/fseiffarth/AppOfTukeyDepth for the code. Data Number Avg. Avg. Error Error Error Error Max. Max. Exact Approx. of Nodes Edges (abso- (rela- per per Node Graph Run- Run- Graphs lute) tive) Node Graph Error Error time (s) time (s) BZR 405 35.75 38.36 0 0 0 0 0 0 56.60 1.04 PTC_MM 336 13.97 14.32 0 0 0 0 0 0 21.09 0.21 COX2 467 41.22 43.45 0 0 0 0 0 0 76.10 1.27 Cuneiform 267 21.27 44.80 0 0 0 0 0 0 2.00 0.61 DHFR 756 42.43 44.54 0 0 0 0 0 0 266.33 3.19 PTC_FR 351 14.56 15.00 1 4.50e-05 1.96e-04 2.85e-03 1 1 23.81 0.25 PTC_FM 349 14.11 14.48 1 4.80e-05 2.03e-04 2.86e-03 1 1 20.64 0.22 MUTAG 188 17.93 19.79 1 6.50e-05 2.97e-04 5.32e-03 1 1 3.01 0.09 PTC_MR 344 14.29 14.69 2 9.20e-05 4.07e-04 5.81e-03 1 1 23.29 0.23 KKI 83 26.96 48.42 12 1.01e-03 5.36e-03 1.45e-01 2 4 40.45 2.43 IMDB-BINARY 1000 19.77 96.53 19 4.37e-04 9.61e-04 1.90e-02 1 3 723.07 113.14 NCI1 3530 29.27 31.88 34 5.20e-05 3.29e-04 9.63e-03 6 8 1194.63 7.76 Peking_1 85 39.31 77.35 40 1.30e-03 1.20e-02 4.71e-01 3 10 4761.33 48.08 MSRC_21C 209 40.28 96.60 86 1.28e-03 1.02e-02 4.11e-01 12 12 51.78 3.06 MSRC_9 221 40.58 97.94 89 1.21e-03 9.92e-03 4.03e-01 7 8 49.33 2.92 OHSU 79 82.01 199.66 130 8.54e-04 2.01e-02 1.65e+00 2 13 42887.32 235.40 ENZYMES 569 31.68 61.44 307 1.68e-03 1.70e-02 5.40e-01 7 10 933.79 8.25 MSRC_21 563 77.52 198.32 877 1.39e-03 2.01e-02 1.56e+00 19 25 3679.24 58.98 COIL-DEL 3900 21.54 54.24 4155 4.05e-03 4.95e-02 1.07e+00 11 17 2242.05 43.00 Table 1 Graph data of different sizes selected from [2]. Disconnected graphs are removed from the original datasets. The columns regarding the approximation quality denote the following. Error (absolute) denotes the overall error on the dataset, Error (relative) denotes the relative error regarding the depths, Error per Node denotes the average error per node, Error per Graph denotes the average error per graph, Max. Node Error denotes the maximum error for a node and Max. Graph Error denotes the maximum error on a graph. The last two columns show the runtimes of the exact and approximation algorithm in seconds. approximating the graph Tukey depth. It is faster (sometimes more than 100 times) than the exact algorithm, even on small graph datasets. Regarding larger graphs, this gap in runtime will increase because of the exponential runtime of the exact algorithm. Additionally, the very small relative errors (at most 4 · 10−3 ), the average errors (at most 1.65 per graph), and also the worst case errors show that the algorithm can be used effectively for further applications based on the Tukey depth (see Sec. 3). 5. Concluding Remarks Graph Tukey depth appears an interesting and promising new concept for mining and learning with graphs. The study of the relationship of graph Tukey depth to other node centrality measures is an interesting question for further research (cf. Fig. 1). For example, while the centroid(s) in trees [27, 28] are exactly the nodes with the highest Tukey depth, this is not necessarily the case for more general graphs beyond trees. Another important issue is to understand the semantics of graph Tukey depth better. For example, what are the properties of the nodes with the highest depth (cf. the def. of Tukey- median in R𝑑 )? We have empirically demonstrated that graph Tukey depth can closely be approximated in small graphs. It is a question of whether this result holds for (very) large graphs as well. To answer this question, the scalability of our approximation algorithm should be improved on the one hand. On the other hand, one needs (possibly tight) theoretical upper bounds on graph Tukey depths. Another interesting question is to identify graph classes for which our heuristic always results in the exact graph Tukey depth. While this is the case for trees, it is unclear whether it holds for outerplanar graphs as well. We believe that this question can be answered affirmatively by using techniques from [16]. We show experimentally that there is a connection between geodesic core-periphery decompositions and the deterministic definition of 𝑘-geodesic cores. This suggests that using our core approximation algorithm [16], we can closely approximate the set of nodes with the highest Tukey depth. Acknowledgments This research has been funded by the Federal Ministry of Education and Research of Germany and the state of North-Rhine Westphalia as part of the Lamarr-Institute for Machine Learning and Artificial Intelligence, LAMARR22B. The authors gratefully acknowledge this support. References [1] M. Newman, Networks: An Introduction, Oxford University Press, Inc., USA, 2010. [2] C. Morris, N. M. Kriege, F. Bause, K. Kersting, P. Mutzel, M. Neumann, Tudataset: A collection of benchmark datasets for learning with graphs, in: ICML, GRL Workshop, 2020. [3] J. O. Cerdeira, P. C. Silva, A centrality notion for graphs based on tukey depth, Appl. Math. Comput. 409 (2021) 126409. [4] D. L. Donoho, M. Gasko, Breakdown Properties of Location Estimates Based on Halfspace Depth and Projected Outlyingness, The Annals of Statistics 20 (1992) 1803 – 1827. [5] J. W. Tukey, Mathematics and the picturing of data, Proceedings of the International Congress of Mathematicians, Vancouver, 1975 2 (1975) 523–531. [6] F. Rosenblatt, The perceptron: A probabilistic model for information storage and organi- zation in the brain, Psychological Review 65 (1958) 386–408. [7] B. E. Boser, I. M. Guyon, V. N. Vapnik, A training algorithm for optimal margin classifiers, COLT, ACM, 1992, pp. 144–152. [8] X. Dai, S. Lopez-Pintado, for the Alzheimer’s Disease Neuroimaging Initiative, Tukey’s depth for object data, Journal of the American Statistical Association (2022) 1–13. [9] P. Mozharovskyi, Contributions to depth-based classification and computation of the Tukey depth, Ph.D. thesis, University of Cologne, 2015. [10] R. Gilad-Bachrach, A. Navot, N. Tishby, Bayes and tukey meet at the center point, in: Learning Theory, Springer Berlin Heidelberg, Berlin, Heidelberg, 2004, pp. 549–563. [11] C. Becker, U. Gather, The masking breakdown point of multivariate outlier identification rules, Journal of the American Statistical Association 94 (1999) 947–955. [12] D. Bremner, D. Chen, J. Iacono, S. Langerman, P. Morin, Output-sensitive algorithms for tukey depth and related problems, Statistics and Computing 18 (2008) 259–266. [13] D. Johnson, F. Preparata, The densest hemisphere problem, Theor. Comput. Sci. 6 (1978) 93–107. [14] F. Seiffarth, T. Horváth, S. Wrobel, Maximal closed set and half-space separations in finite closure systems, in: ECML PKDD 2019, volume 11906 of LNCS, Springer, 2019, pp. 21–37. [15] M. Tilen, L. Šubelj, Convexity in complex networks, Network Science 6 (2018) 176–203. [16] F. Seiffarth, T. Horváth, S. Wrobel, A fast heuristic for computing geodesic cores in large networks, 2022. URL: https://arxiv.org/abs/2206.07350. [17] L. Šubelj, Convex skeletons of complex networks, J. R. Soc. Interface 15 (2018). [18] L. Šubelj, D. Fiala, T. Ciglaric, L. Kronegger, Convexity in scientific collaboration networks, Journal of Informetrics 13 (2019) 10–31. [19] J. Leskovec, A. Krevl, SNAP Datasets: Stanford large network dataset collection, http: //snap.stanford.edu/data, 2014. [20] P. H. M. de Araújo, M. B. Campêlo, R. C. Corrêa, M. Labbé, The geodesic classification problem on graphs, volume 346 of Elec. Notes Theor. Comput. Sci., Elsevier, 2019, pp. 65–76. [21] F. Seiffarth, T. Horváth, S. Wrobel, Maximum margin separations in finite closure systems, in: ECML PKDD 2020, volume 12457 of LNCS, Springer, 2020, pp. 3–18. [22] M. Thiessen, T. Gärtner, Active learning of convex halfspaces on graphs, in: Advances in Neural Information Processing Systems, 2021. [23] W. W. Zachary, An information flow model for conflict and fission in small groups, Journal of Anthropological Research 33 (1977) 452–473. [24] D. E. Knuth, The Stanford GraphBase: A Platform for Combinatorial Computing, Associa- tion for Computing Machinery, 1993. [25] D. Lusseau, The emergent properties of a dolphin social network, Proceedings of the Royal Society of London. Series B: Biological Sciences 270 (2003). [26] I. M. Pelayo, Geodesic Convexity in Graphs, Springer New York, 2013. [27] W. Piotrowski, A generalization of branch weight centroids, Appl. Math. 19 (1987) 541–545. [28] W. Piotrowski, M. M. Syslo, Some properties of graph centroids, Ann. Oper. Res. 33 (1991) 227–236.