Semi-supervised Community Detection in Dynamic Graphs

Matteo Bianco, Luca Cagliero* and Luca Vassio
Politecnico di Torino, Corso Duca degli Abruzzi, 24, 10129 Torino TO, Italy

Abstract
Semi-supervised approaches to Community Detection (CD) in graphs aim to detect communities closely related to a few labeled ones. State-of-the-art semi-supervised algorithms adopt a three-step process, which entails (1) Generating candidate communities based solely on the network structure; (2) Selecting the candidates that are most similar to the labeled communities; (3) Refining the candidates shortlisted at Step (2). However, existing approaches are unsuited to handle the dynamics in labeled communities and their relations with time-varying graph structures. In this work, we formulate the new task of semi-supervised CD from dynamic graphs, which is relevant to real-world time-evolving scenarios. To avoid executing the previous pipeline independently at every time step and potentially missing relevant temporal community-level relations, we envisage a new approach relying on time-aware strategies for both dynamic graph embedding and community selection and refinement. We leverage a latent graph representation incorporating node- and subgraph-level temporal relations neglected by static approaches. Then, supervised community refinements are propagated across consecutive time steps to capture time-evolving trends. After adapting static CD models to the dynamic scenario, we conduct extensive comparisons of the methods on datasets with varying characteristics in the novel task of dynamic semi-supervised CD. The proposed approach shows remarkable improvements in low-modularity and low-stability dynamic graphs.

Keywords
Community Detection, Dynamic Graphs, Semi-Supervised Learning

1. Introduction

Communities in graphs are groups of nodes with distinctive features or connections [1]. Automating the process of Community Detection (CD) from graphs is a well-established machine learning problem. It has found application in several fields, among which social network analysis [2], scientometrics [3], and healthcare [4].

To deeply explore the network graph structure, unsupervised approaches to CD have attempted to use a variety of classical data mining or Deep Learning techniques such as clustering [5], graph mining [6], and Variational AutoEncoders [7]. However, they struggle to find communities of nodes with functional relations that are not directly derivable from the network structure [8]. To tackle this issue, Semi-Supervised CD (SS-CD) approaches leverage a few examples of labeled communities, typically provided by domain experts. To efficiently address SS-CD on large graphs, state-of-the-art approaches (e.g., [9, 10, 11]) first encode graph nodes or subgraphs using graph embedding techniques. Next, they extract candidate communities based solely on the network structure. Finally, they shortlist and refine the extracted candidates by maximizing their similarity with the labeled communities in the embedding space.

Since real-world communities are naturally time-evolving, there is an increasing need to extend existing CD solutions suited to static graphs towards dynamic scenarios. Recent approaches to CD capture time-evolving trends in dynamic graphs by learning temporal graph embeddings. They either learn temporal relations in sequences of graph snapshots [12, 13, 14] or rely on parametric distance dynamic models [15]. However, to the best of our knowledge, all prior works on CD from dynamic graphs are unsuited to a semi-supervised scenario where labeled communities change over time. This calls for new approaches addressing SS-CD from dynamic graphs, in which temporal relations between graph snapshots and time-varying labeled communities are jointly analyzed.

Contribution Firstly, we formalize the Semi-Supervised Dynamic Community Detection (SS-DCD) task (Section 2). Unlike the unsupervised task, here the CD process is guided by labeled communities. Secondly, we propose an approach to tackle the newly proposed SS-DCD task (Section 3). Unlike all existing works (e.g., [9, 10]), our approach leverages temporal graph embeddings to incrementally update the dynamic graph representation. Furthermore, the steps of supervised community selection and refinement are time-aware. The aim is to consider not only the temporal relations between network graphs in the different snapshots but also the evolution of the labeled communities. Lastly, we empirically compare the performance of our approach and baseline methods under the unexplored SS-DCD setting (Section 5).

Evaluation We adapt real static graphs with ground-truth communities and various topological characteristics to a dynamic scenario (Section 4). On top of the dynamic graphs, we assess both our approach and baseline methods in terms of the established F1, Jaccard, and ONMI performance scores. The results show that our approach performs on average the best on the analyzed datasets in terms of F1, Jaccard, and ONMI scores (19 wins out of 27 combinations of datasets and metrics). The results are mainly influenced by the network modularity and the level of dynamics in the sequence of graph snapshots and corresponding labels. Specifically, when the graph has a high modularity the communities are relatively easy to detect from the current network topology regardless of its past snapshots and labeled communities. The experiments confirm that the lower the modularity the higher the benefits of the newly proposed time-aware semi-supervised strategy. Similarly, the temporal stability of the network [16] is another important indicator of the level of complexity of the SS-DCD task. Our results confirm the better suitability of the proposed approach to dynamic scenarios compared to state-of-the-art approaches.

Published in the Proceedings of the Workshops of the EDBT/ICDT 2025 Joint Conference (March 25-28, 2025), Barcelona, Spain.
* Corresponding author.
Emails: matteo.bianco@studenti.polito.it (M. Bianco); luca.vassio@polito.it (L. Cagliero); luca.vassio@polito.it (L. Vassio)
ORCID: 0000-0002-7185-5247 (L. Cagliero); 0000-0002-2920-1856 (L. Vassio)
© 2025 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
2. Problem statement

We first introduce the preliminary concepts and related notation and then formalize the newly proposed Semi-Supervised Dynamic Community Detection (SS-DCD) task.

Let G^t = (V^t, E^t) be a graph where V^t and E^t are the corresponding sets of nodes and edges, respectively. A community c^t detected from G^t is a subset of the nodes with peculiar characteristics in terms of either network graph structure or node properties. Given a graph G^t, the Unsupervised Community Detection (U-CD) task has the goal of extracting the set C^t of all its communities without any prior knowledge on the community properties, i.e., U-CD exclusively relies on the network graph structure. When, instead, the extraction process is guided by a given set Ĉ^t of labeled communities in G^t, the task is commonly denoted by Supervised Community Detection (S-CD). The key idea behind S-CD is to detect communities in graphs that are closely related to the labeled ones by learning a model that automatically captures similarities among subgraphs. Importantly, S-CD is not necessarily based on the network structure solely but might consider functional properties of nodes or edges as well [8]. In this work, we assume that the CD task is partially supervised, i.e., the input set of labeled communities is incomplete. Given a graph G^t and a training set Ĉ^t consisting of a few labeled communities in G^t, we denote by Semi-Supervised Community Detection (SS-CD) the task of automatically detecting all the communities in G^t.

We aim to model the temporal variations of a graph structure and its underlying communities within a reference time period. Without loss of generality, we assume that the reference period is divided into T discrete consecutive time steps (i.e., t ∈ {1, 2, ..., T}).

Dynamic Community Detection (DCD) aims to detect all communities from sequences of graph snapshots 𝒢 = {G^t}_{t=1}^T. Unsupervised approaches to DCD extract communities from G^t for every t ∈ {1, 2, ..., T} in the absence of labeled communities. Conversely, Supervised DCD (S-DCD) leverages the set Ĉ^t of labeled communities occurring in each time step t. However, similar to the time-invariant case, the set of labeled communities is likely incomplete due to the lack of human annotations or reliable community descriptors. To tackle the above issue, we formalize the new task of Semi-Supervised Dynamic Community Detection (SS-DCD, in short). The twofold aims are, on the one hand, to make SS-CD time-aware by leveraging CD semi-supervision and, on the other hand, to enrich Unsupervised DCD with a combined analysis of the properties of both the network structure and the labeled communities across different snapshots.

SS-DCD task formulation Given a sequence of graph snapshots 𝒢 and a partial set Ĉ^t of labeled communities occurring at every time step t ∈ {1, 2, ..., T}, the SS-DCD task aims to extract the sequence 𝒞 = {C^t}_{t=1}^T of community sets C^1, ..., C^T.
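In compact form, keeping the notation introduced above, the task can be read as the following mapping (a restatement of the formulation above, with no additional constraints implied):

\[
\text{SS-DCD}:\ \big(\mathcal{G}=\{G^t\}_{t=1}^{T},\ \{\hat{C}^t\}_{t=1}^{T}\big)\ \longmapsto\ \mathcal{C}=\{C^t\}_{t=1}^{T},
\qquad \hat{C}^t\ \text{partial (few labeled communities) for every } t .
\]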
3. Semi-supervised Community Detection from Dynamic Graphs

The classical SS-CD pipeline entails the following steps:
1. Candidate community generation, aimed to extract candidate communities based on the structure of the network graph.
2. Supervised candidate selection, which reduces the set of candidates generated at the previous step according to their similarity with the labeled communities.
3. Community refinement, aimed to revise the generated communities, e.g., by dropping or adding nodes or edges.

However, all the above-mentioned steps ignore the temporal evolution of both graph structure and communities, which is peculiar to the newly proposed SS-DCD task.

Figure 1 shows how to naively use the classical SS-CD pipeline in an SS-DCD task. For the sake of simplicity, we exemplify the CD process across two consecutive time steps only, namely t-1 and t. The temporal graph snapshots G^{t-1} and G^t are separately embedded to generate the candidate communities. Next, the candidate selection and refinement processes are executed independently for every time step. This existing pipeline has two main drawbacks:

• The process of generation of the candidate communities disregards the temporal relations among graph snapshots and related communities. Consequently, the next supervised candidate selection step could be biased.
• The community refinement step is unaware of the outcomes of the supervised community detection and refinement steps carried out at the previous time steps. Hence, it potentially disregards all past (labeled or detected) community updates as well as the temporal relations with the surrounding network.

Let us consider, for example, the labeled community ĉ^{t-1} at time step t-1 consisting of nodes v1 and v3 (ĉ^{t-1} = {v1, v3}). The path connecting v1 and v3 changes at time step t by including also node v4. Knowing both the past community composition and the new network structure allows the CD algorithm to learn the temporal relation with its updated version ĉ^t = {v1, v3, v4}. Analogously, the exclusion of node v2 from ĉ^{t-1} is a valuable hint for the definition of its updated version at time step t.

Figure 1: Separate application of the existing SS-CD pipeline to two consecutive time steps.

To overcome the aforesaid limitations, we propose a new SS-DCD pipeline (see Figure 2). Compared to the standard pipeline, we incorporate the following two additional features making the SS-CD pipeline end-to-end time-aware.

Figure 2: The proposed SS-DCD pipeline applied to two consecutive time steps.

Time-aware graph embeddings Rather than learning independent embedding matrices for every snapshot t in 𝒢, we compute a time-aware embedding representation H^t of the previous graph snapshots G^1, ..., G^t capturing time-varying properties of the network graph structure. The embedding representation is incrementally updated at each time step and, importantly, is directly fed to the supervised candidate selection step to allow time-aware semi-supervision (see the time-aware supervised candidate selection step in Figure 2). The designed SS-DCD pipeline is agnostic to the encoder used to compute the embedding matrix. For incremental learning of node embeddings, our current implementation leverages the ROLAND node embeddings [13] (more details are given in Section 5.3).
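To make the incremental update concrete, the sketch below shows one way a ROLAND-style encoder can carry node states across snapshots: a message-passing step on the current adjacency followed by a GRU cell that merges the new evidence with the previous state H^{t-1}. It is a minimal illustration written in plain PyTorch; the layer sizes, the single propagation layer, and the absence of ROLAND's hierarchical node states are simplifying assumptions, not the implementation used in the paper.

```python
import torch
import torch.nn as nn

class IncrementalNodeEncoder(nn.Module):
    """Toy time-aware node encoder: H^t = GRU(message(A^t, H^{t-1}), H^{t-1})."""
    def __init__(self, dim: int):
        super().__init__()
        self.msg = nn.Linear(dim, dim)      # transforms aggregated neighbor information
        self.update = nn.GRUCell(dim, dim)  # merges new evidence with the past node state

    def forward(self, adj_t: torch.Tensor, h_prev: torch.Tensor) -> torch.Tensor:
        # One propagation step over the snapshot at time t (row-normalized adjacency).
        deg = adj_t.sum(dim=1, keepdim=True).clamp(min=1.0)
        neigh = (adj_t / deg) @ h_prev
        m = torch.relu(self.msg(neigh))
        # Incremental update: H^t depends on all past snapshots through h_prev,
        # so the embedding matrix is never recomputed from scratch.
        return self.update(m, h_prev)

# Usage on a short sequence of snapshots (random adjacencies as placeholders).
num_nodes, dim, T = 5, 16, 3
encoder = IncrementalNodeEncoder(dim)
h = torch.zeros(num_nodes, dim)  # H^0
snapshots = [torch.bernoulli(torch.full((num_nodes, num_nodes), 0.3)) for _ in range(T)]
for adj_t in snapshots:
    h = encoder(adj_t, h)        # H^t, fed to the time-aware candidate selection step
```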
Feed-Forward propagation of community refinements To tailor community refinement to a dynamic scenario, (1) we leverage time-aware embeddings to also consider the time-variant properties of the nodes composing the selected community to be refined; and (2) we apply a feed-forward propagation of the refined communities' information learnt at time step t across the subsequent time steps.

Given a selected community c^t, we revise its node composition using a Reinforcement Learning (RL) approach. Let N_{c^t} be the subset of G^t nodes belonging to the neighborhood of c^t. We define the initial state of the RL agent as the union of the community with its neighborhood (c^t ∪ N_{c^t}). The state representation consists of the time-aware embeddings of the composing nodes, i.e., H_{c^t ∪ N_{c^t}}.

Similar to CLARE [9], we define two policy networks, consisting of separate MultiLayer Perceptrons, to decide whether to execute any of the following refinement actions on the community c^t:
• Expansion, which adds to the community a new node taken from its neighborhood;
• Reduction, which excludes nodes that are already part of the community.

Each action is rewarded according to the F1-Score improvement achieved compared to the previous iteration. The ExpMLP network returns the probabilities of adding to c^t a new node from the community neighborhood N_{c^t}, whereas ReductMLP estimates the likelihood of removing any node from c^t. To make the community refinement process time-aware, both networks are fed with the time-aware embeddings. The action probability distributions are defined as follows:

P(action = reduction of c^t) = softmax( ReductMLP( H_{c^t ∪ N_{c^t}} ) )
P(action = expansion of c^t) = softmax( ExpMLP( H_{c^t ∪ N_{c^t}} ) )

The aforesaid probability distributions are propagated to the supervised community selection stage to initialize the CD process at the next time step (see the refinements' feedforward propagation arrow in Figure 2).
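As an illustration of the two policies, the snippet below sketches ExpMLP/ReductMLP-style scorers that map the time-aware embeddings of c^t ∪ N_{c^t} to softmax distributions over candidate nodes to add or remove. The two-layer architecture, hidden size, and masking scheme are our own simplifying assumptions for exposition; they are not taken from the CLARE code nor from the implementation evaluated in Section 5.

```python
import torch
import torch.nn as nn

class NodeScorer(nn.Module):
    """Scores each node of c^t ∪ N_{c^t}; instantiated once as ExpMLP and once as ReductMLP."""
    def __init__(self, dim: int, hidden: int = 32):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, h_state: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
        # h_state: (|c^t ∪ N_{c^t}|, dim) time-aware embeddings; mask: 1 for eligible nodes.
        scores = self.mlp(h_state).squeeze(-1)
        scores = scores.masked_fill(mask == 0, float("-inf"))
        return torch.softmax(scores, dim=-1)  # action probability distribution

# Toy usage: 4 community nodes followed by 3 neighborhood nodes, 16-dim embeddings.
h_state = torch.randn(7, 16)
in_community = torch.tensor([1, 1, 1, 1, 0, 0, 0])
exp_mlp, reduct_mlp = NodeScorer(16), NodeScorer(16)
p_expand = exp_mlp(h_state, mask=1 - in_community)  # which neighbor to add (Expansion)
p_reduce = reduct_mlp(h_state, mask=in_community)   # which member to drop (Reduction)
```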
4. Dynamic graphs

In this section, we introduce the generator used to create dynamic graphs with time-evolving labeled communities and the datasets used in the experiments.

4.1. Synthetic Generator of Dynamic Graphs with Labeled Communities

CD algorithms are commonly tested on both real-world and synthetic data [1]. Real benchmarks including ground-truth communities (e.g., [17, 18]) are mostly static. Few of them consist of dynamic graphs with annotations (e.g., [19, 20]) but include a relatively low number of ground-truth communities (< 10) and nodes (on average < 100), thus hindering their use for testing Deep Learning techniques. Synthetic generators (e.g., [21, 16]) provide ground-truth communities generated by reference external models, thus making the process of semi-supervision unrealistic.

To bridge the gap, we extend a synthetic generator of dynamic graphs [21] to simulate the temporal variations of real graphs and ground-truth communities designed for static scenarios. The cornerstones of our synthetic graph generator, namely DynamizeGraph, are enumerated below.
• We consider real graphs and ground-truth communities as initial snapshots (at time step 1).
• We simulate the temporal evolution of the real graph and its ground-truth communities by hiding or showing nodes, edges, or labeled communities.
• We apply graph transformations from one snapshot to another that are related to either specific nodes/edges (micro-operations) or entire communities (macro-operations).

A Python implementation of DynamizeGraph is also available, for research purposes, in the official project repository.
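The snippet below illustrates the kind of edge-level micro-operation DynamizeGraph applies between consecutive snapshots: starting from a real graph, a fraction of visible edges is hidden and previously hidden edges are shown again. It is a minimal sketch of the idea only; the function name, the single edge-level operation, and the fixed hiding/showing probabilities are hypothetical and do not reproduce the released generator.

```python
import random
import networkx as nx

def next_snapshot(g: nx.Graph, hidden: set, p_hide: float = 0.05, p_show: float = 0.3, seed: int = 0):
    """Build snapshot t+1 from snapshot t by hiding some visible edges and showing hidden ones."""
    rng = random.Random(seed)
    g_next, hidden_next = g.copy(), set(hidden)
    for u, v in list(g.edges()):          # micro-operation: hide an existing edge
        if rng.random() < p_hide:
            g_next.remove_edge(u, v)
            hidden_next.add((u, v))
    for u, v in list(hidden):             # micro-operation: show a previously hidden edge
        if rng.random() < p_show:
            g_next.add_edge(u, v)
            hidden_next.discard((u, v))
    return g_next, hidden_next

# Usage: turn a static graph into a short sequence of snapshots (time step 1 = real graph).
g1 = nx.karate_club_graph()
snapshots, hidden = [g1], set()
for t in range(1, 4):
    g_next, hidden = next_snapshot(snapshots[-1], hidden, seed=t)
    snapshots.append(g_next)
```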
4.2. Datasets

We run our experiments on nine different datasets: six of them are generated by DynamizeGraph starting from real graphs and labeled communities, whereas the remaining ones are purely synthetic and generated by DANCer [21]. Table 1 summarizes the main dataset statistics. Given the ground-truth communities, we consider 50% of them as training labeled communities and the remaining 50% as test labeled communities (see Section 5.1).

Table 1
Main dataset statistics: average number of vertices and edges per snapshot, average number of ground-truth communities per snapshot (train + test), average modularity per snapshot, and dynamic graph stability computed according to [16].

Dataset | #Nodes | #Edges | #Commun. | Modularity | Stability
Real initial network and communities - Dynamics injection with high stability:
email-Eu-core | 802 | 10590 | 40 | 0.31 | 0.97
Amazon | 13465 | 31233 | 1141 | 0.99 | 0.98
Youtube | 28861 | 124404 | 2373 | 0.22 | 0.97
Real initial network and communities - Dynamics injection with low stability:
email-Eu-core | 550.6 | 2726.9 | 38.7 | 0.65 | 0.80
Amazon | 9580.0 | 16729.9 | 1051.6 | 0.99 | 0.81
Youtube | 16158.5 | 30836.1 | 2107.4 | 0.55 | 0.74
Purely synthetic:
Syn_Const | 5279.3 | 18894.0 | 500.0 | 0.82 | 0.99
Syn_Growth | 3489.6 | 12594.8 | 501.0 | 0.88 | 0.93
Syn_Shrink | 4079.6 | 19735.0 | 500.5 | 0.84 | 0.98

Real graphs and communities We rely on three real networks with ground-truth communities retrieved from SNAP [17], i.e., email-Eu-core, com-Amazon, and com-Youtube. The real networks are all static but show rather different characteristics. Specifically, email-Eu-core is a denser yet smaller network including roughly 20 nodes per ground-truth community, whereas com-Amazon and com-Youtube are sparser yet much larger datasets where communities consist of approximately 10 nodes each. In terms of network modularity, the real graphs also significantly differ from each other: com-Amazon has a very high modularity whereas email-Eu-core and com-Youtube show fairly low modularity values. As discussed later on, the lower the modularity the more complex the CD task in the absence of appropriate supervision.

Injection of network dynamics We extend the real static graphs and ground-truth communities under two different dynamic settings, i.e., low or high stability. In compliance with [16], we define the stability as the average difference in Adjusted Mutual Information of the communities [22] between two consecutive time steps. The higher the stability the lower the strength of the dynamics in the network communities across consecutive graph snapshots. We expect to achieve higher benefits from our time-aware approach on dynamic graphs with relatively (but not excessively) low stability. As discussed in Section 5, the results meet the expectation.

Purely synthetic dynamic graphs We generate three dynamic graphs whose snapshots have different sizes: Syn_Const shows a roughly constant number of nodes per snapshot, in Syn_Growth the number of nodes per snapshot increases over time, whereas Syn_Shrink shows the opposite trend.

5. Experimental results

All the experiments were conducted on a single NVIDIA Tesla V100 SXM2 GPU with 32 GB memory.

5.1. Performance metrics

We evaluate SS-CD performance using the following three established metrics: F1 score, Jaccard score, and Overlapping Normalized Mutual Information (ONMI, in short). In all cases, we use a set of labeled test communities Ĉ_test with no intersection with the training ones Ĉ. To cope with dynamic scenarios, all the scores are averaged over all snapshots.

The F1 and Jaccard scores are established metrics for comparing test ground-truth and predicted communities in SS-CD [23, 24, 25, 9]. The higher the scores, the more accurate the community matching (in whatever direction). To more deeply analyze the impact of graph modularity on CD performance, we also use alternative weighted versions of the F1 and Jaccard scores. Since we are particularly interested in exploring the capabilities of CD methods to leverage the information extracted from labeled communities, we weight each test community ĉ^t inversely proportionally to its modularity m(ĉ^t): the lower the modularity of a test community, the lower its predictability in the absence of supervised knowledge, and the higher its matching contribution to the overall score. Finally, we use the Overlapping Normalized Mutual Information (ONMI) [26]. It is a rescaled version of the Mutual Information between the predicted and test sets of communities at time step t.
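The snippet below sketches how such set-matching scores can be computed: each test community is matched with its best-scoring predicted community (and vice versa, to cover both directions), and the per-community scores are averaged; repeating this per snapshot and averaging gives the dynamic version. This is a generic best-match formulation under our own assumptions, not the exact scoring code used in the experiments, and the inverse-modularity weighting is only hinted at through an optional weight argument.

```python
from statistics import mean

def jaccard(a: set, b: set) -> float:
    return len(a & b) / len(a | b) if a | b else 0.0

def f1(a: set, b: set) -> float:
    inter = len(a & b)
    if inter == 0:
        return 0.0
    p, r = inter / len(b), inter / len(a)
    return 2 * p * r / (p + r)

def best_match_score(truth, pred, score=f1, weights=None):
    """Average best-match score in both directions; `weights` (e.g., inverse modularity)
    optionally re-weights the ground-truth communities."""
    if not truth or not pred:
        return 0.0
    w = weights or [1.0] * len(truth)
    t2p = sum(wi * max(score(t, p) for p in pred) for wi, t in zip(w, truth)) / sum(w)
    p2t = mean(max(score(t, p) for t in truth) for p in pred)
    return (t2p + p2t) / 2

# Usage on one snapshot; for a dynamic graph, average the result over all snapshots.
truth = [{1, 2, 3}, {4, 5}]
pred = [{1, 2}, {4, 5, 6}]
print(best_match_score(truth, pred, score=f1), best_match_score(truth, pred, score=jaccard))
```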
5.2. Baselines

As baseline methods, we consider the following four state-of-the-art DCD approaches, extended to cope with the semi-supervised setting: DynGEM [12], CTGCN-S [27], CTGCN-C [27], and ROLAND [13]. Specifically, we modify the respective architectures by integrating an additional cross-entropy loss term for supervision driven by the labeled communities. Furthermore, we also consider CLARE [9], which is the latest and best-performing SS-CD approach¹.

The key differences between the extended DCD versions and the SS-CD baseline are that (1) CLARE relies on static order embeddings, and (2) the SS-CD baseline also performs community refinement on top of the supervised candidate selection.

¹ We omit the comparisons with MARS [10] because, to the best of our knowledge, the paper is currently under review and the source code is not available yet.
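The extension of the DCD baselines can be pictured as adding a membership-prediction head on top of the temporal node embeddings and supervising it only on the labeled communities. The sketch below shows one such auxiliary term using a binary cross-entropy over node-community membership; the head, the loss weight, and the way labels are encoded are assumptions made for illustration, not the exact modification applied to each baseline.

```python
import torch
import torch.nn as nn

class MembershipHead(nn.Module):
    """Predicts, for every node, its membership in each labeled training community."""
    def __init__(self, dim: int, num_labeled_communities: int):
        super().__init__()
        self.head = nn.Linear(dim, num_labeled_communities)

    def supervision_loss(self, h_t: torch.Tensor, membership: torch.Tensor) -> torch.Tensor:
        # h_t: (num_nodes, dim) temporal embeddings; membership: (num_nodes, K) in {0, 1}.
        return nn.functional.binary_cross_entropy_with_logits(self.head(h_t), membership)

# Usage: add the supervision term to whatever loss the DCD baseline already optimizes.
h_t = torch.randn(100, 16)
membership = (torch.rand(100, 8) > 0.9).float()   # toy labels for 8 labeled communities
aux = MembershipHead(16, 8)
baseline_loss = torch.tensor(0.0)                 # placeholder for the baseline's own loss
loss = baseline_loss + 0.5 * aux.supervision_loss(h_t, membership)
```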
5.3. Experimental settings

We test our approach by varying (1) the node embedding strategy (we test Node2Vec [28], ROLAND [13], DynGEM [12], CTGCN-S and CTGCN-C [27]); (2) the policy network architecture, in terms of type (we test MLP, GRU, and moving average) and complexity (i.e., we vary the number of attention heads); (3) the dropout rate (between zero and one); and (4) the number of training epochs (up to 2000). Based on a grid search, ROLAND [13] turns out to be the best-performing dynamic graph encoder, while a single-head GRU is the best policy network. We set the dropout rate to zero, as its impact is negligible, and the number of training epochs to 2000.

For the baseline methods, we adapt the source code released by the papers' authors. All the DCD baselines are trained for 50 epochs, whereas we train CLARE for 30 epochs to perform candidate generation and for 1000 epochs to perform community rewriting. We always use 16-dimension embeddings for DCD models and 64-dimension order embeddings for CLARE.

For all methods, we set the number of expected communities, whenever requested as an input parameter, to the number of labeled communities in the training data.

To compute the statistics on the network graphs we use the NetworkX library [29].

5.4. Performance results

Table 2 reports the F1, Jaccard, and ONMI scores achieved by our approach and the baseline methods on the test communities. Among the tested baselines, CLARE performs best on average on the analyzed datasets and settings, confirming the benefits of adopting the complete SS-CD pipeline. Our approach outperforms all the tested baselines, including CLARE, on the YouTube and the synthetic dynamic graphs; it performs best on email-Eu-core in four out of six dataset-metric combinations, whereas it ranks second behind CLARE on the Amazon dataset. On average, it shows superior performance on graphs with low modularity (e.g., YouTube) and better suitability for fairly low-stability settings. The reason is that the time-aware approach provides clear benefits when graph snapshots and labeled communities are dynamic (i.e., low stability values) as long as the strength of the dynamics does not invalidate the relevance of the temporal graph relations. For instance, on the same dataset (email-Eu-core) the average F1 score of our approach soars from 0.1931 to 0.3213 moving from a high-stability setting to a low-stability one. Conversely, on datasets like Amazon where the modularity is high, CD based on the analysis of the network structure is already quite effective. Therefore, the benefits achieved by semi-supervision turn out to be limited.

Figure 3 graphically shows the correlation between average graph modularity and per-dataset F1-score gaps between our approach and CLARE. The result confirms the negative correlation between modularity and usefulness of the time-aware community refinement step. A quantitative comparison in terms of Weighted F1 and Jaccard scores on low-modularity datasets confirms our previous findings (e.g., on YouTube our average Weighted F1-Score is 1.89 vs. 0.9 of CLARE). The achieved results indicate that the time-aware approach turns out to be particularly helpful when the community-level information conveyed by the network graph alone is not enough.

Figure 3: Correlation between average graph modularity and the average F1-score gaps between our approach and the SS-CD state-of-the-art model [9]. The higher the modularity, the lower the benefits of the time-aware approach.

Table 2
Performance comparison between baselines and our method on dynamic graphs with different characteristics. For each combination of method, dataset, and performance metric we report the average score over multiple runs and the standard deviation. The best average performance per dataset and score is highlighted in boldface.

Dataset | DynGEM [12] | CTGCN-C [27] | CTGCN-S [27] | ROLAND [13] | CLARE [9] | Ours

Real initial network and communities - Dynamics injection with high stability
F1 Score:
email-Eu-core | 0.3450 ± 0.0821 | 0.3815 ± 0.0210 | 0.1706 ± 0.0167 | 0.1699 ± 0.0105 | 0.1591 ± 0.0301 | 0.1931 ± 0.0263
Amazon | 0.3282 ± 0.0245 | 0.5488 ± 0.0539 | 0.1607 ± 0.0183 | 0.1229 ± 0.0074 | 0.6757 ± 0.0131 | 0.6207 ± 0.0304
YouTube | 0.2134 ± 0.0162 | 0.1975 ± 0.0217 | 0.2486 ± 0.0289 | 0.0867 ± 0.0084 | 0.4596 ± 0.0122 | 0.4665 ± 0.0229
Jaccard Score:
email-Eu-core | 0.2309 ± 0.0676 | 0.2547 ± 0.0204 | 0.0951 ± 0.0106 | 0.0944 ± 0.0063 | 0.0936 ± 0.0199 | 0.1169 ± 0.0218
Amazon | 0.2209 ± 0.0217 | 0.4390 ± 0.0566 | 0.0928 ± 0.0118 | 0.0667 ± 0.0042 | 0.6325 ± 0.0127 | 0.5699 ± 0.0325
YouTube | 0.1349 ± 0.0125 | 0.1192 ± 0.0157 | 0.1612 ± 0.0217 | 0.0473 ± 0.0050 | 0.3980 ± 0.0113 | 0.4042 ± 0.0216
ONMI:
email-Eu-core | 0.1173 ± 0.0580 | 0.0956 ± 0.0285 | 0.0042 ± 0.0055 | 0.0000 ± 0.0000 | 0.0102 ± 0.0165 | 0.0254 ± 0.0297
Amazon | 0.0833 ± 0.0258 | 0.3440 ± 0.0730 | 0.0084 ± 0.0033 | 0.0002 ± 0.0004 | 0.6164 ± 0.0114 | 0.5487 ± 0.0352
YouTube | 0.0330 ± 0.0093 | 0.0177 ± 0.0081 | 0.0547 ± 0.0159 | 0.0009 ± 0.0010 | 0.3596 ± 0.0104 | 0.3678 ± 0.0222

Real initial network and communities - Dynamics injection with low stability
F1 Score:
email-Eu-core | 0.3032 ± 0.0555 | 0.4637 ± 0.0791 | 0.1979 ± 0.0242 | 0.2042 ± 0.0137 | 0.3117 ± 0.0520 | 0.3213 ± 0.0504
Amazon | 0.4765 ± 0.0333 | 0.4657 ± 0.0325 | 0.1873 ± 0.0320 | 0.1138 ± 0.0075 | 0.6414 ± 0.0127 | 0.6120 ± 0.0266
YouTube | 0.1426 ± 0.0125 | 0.2532 ± 0.0409 | 0.2352 ± 0.0497 | 0.1041 ± 0.0203 | 0.4867 ± 0.0244 | 0.4912 ± 0.0307
Jaccard Score:
email-Eu-core | 0.1919 ± 0.0447 | 0.3432 ± 0.0802 | 0.1124 ± 0.0166 | 0.1176 ± 0.0098 | 0.2183 ± 0.0495 | 0.2206 ± 0.0509
Amazon | 0.3799 ± 0.0399 | 0.3580 ± 0.0316 | 0.1100 ± 0.0205 | 0.0617 ± 0.0042 | 0.5935 ± 0.0147 | 0.5560 ± 0.0324
YouTube | 0.0864 ± 0.0093 | 0.1596 ± 0.0310 | 0.1514 ± 0.0349 | 0.0583 ± 0.0118 | 0.4252 ± 0.0245 | 0.4308 ± 0.0301
ONMI:
email-Eu-core | 0.0821 ± 0.0593 | 0.2188 ± 0.1012 | 0.0036 ± 0.0075 | 0.0065 ± 0.0103 | 0.1447 ± 0.0728 | 0.1200 ± 0.0792
Amazon | 0.3082 ± 0.0592 | 0.2640 ± 0.0386 | 0.0088 ± 0.0045 | 0.0008 ± 0.0010 | 0.5862 ± 0.0169 | 0.5495 ± 0.0307
YouTube | 0.0147 ± 0.0066 | 0.0382 ± 0.0198 | 0.0478 ± 0.0199 | 0.0022 ± 0.0013 | 0.3944 ± 0.0312 | 0.4013 ± 0.0367

Purely synthetic
F1 Score:
Syn_Const | 0.3056 ± 0.0132 | 0.4778 ± 0.0699 | 0.1682 ± 0.0084 | 0.1360 ± 0.0030 | 0.5947 ± 0.0295 | 0.6071 ± 0.0228
Syn_Growth | 0.4074 ± 0.0308 | 0.5908 ± 0.0869 | 0.2518 ± 0.0182 | 0.1856 ± 0.0126 | 0.7048 ± 0.0227 | 0.7134 ± 0.0183
Syn_Shrink | 0.4199 ± 0.0498 | 0.6443 ± 0.0989 | 0.2227 ± 0.0245 | 0.1730 ± 0.0144 | 0.6860 ± 0.0443 | 0.6886 ± 0.0355
Jaccard Score:
Syn_Const | 0.1899 ± 0.0104 | 0.3480 ± 0.0641 | 0.0936 ± 0.0053 | 0.0735 ± 0.0018 | 0.4886 ± 0.0314 | 0.5064 ± 0.0271
Syn_Growth | 0.2768 ± 0.0294 | 0.4699 ± 0.0920 | 0.1489 ± 0.0130 | 0.1037 ± 0.0078 | 0.6264 ± 0.0279 | 0.6420 ± 0.0228
Syn_Shrink | 0.2905 ± 0.0471 | 0.5382 ± 0.1138 | 0.1306 ± 0.0166 | 0.0963 ± 0.0090 | 0.5973 ± 0.0547 | 0.6032 ± 0.0425
ONMI:
Syn_Const | 0.0258 ± 0.0094 | 0.2114 ± 0.0740 | 0.0008 ± 0.0013 | 0.0000 ± 0.0000 | 0.4708 ± 0.0437 | 0.4944 ± 0.0362
Syn_Growth | 0.1150 ± 0.0402 | 0.3763 ± 0.1277 | 0.0100 ± 0.0048 | 0.0000 ± 0.0000 | 0.6337 ± 0.0336 | 0.6493 ± 0.0247
Syn_Shrink | 0.1368 ± 0.0615 | 0.4555 ± 0.1473 | 0.0090 ± 0.0053 | 0.0007 ± 0.0011 | 0.5965 ± 0.0614 | 0.6040 ± 0.0453

5.5. Ablation study

We carry out an ablation study to quantify the effect of the choice of the node embedding strategy on dynamic graphs characterized by variable modularity and stability levels. To this end, we compare two alternative state-of-the-art strategies for temporal graph embeddings, i.e., ROLAND [13] and CTGCN [27].

The plots in Figure 4 show the F1-score gaps observed in our approach between the variants with ROLAND and CTGCN graph embeddings. They respectively show the separate influence of graph modularity (plot on the left) and stability (right) on the results of the ablation study. Thanks to the use of hierarchical node states, ROLAND embeddings turn out to be more effective than CTGCN in capturing dynamic graph and community relations. They are also, on average, more robust on graphs with low/medium modularity. On datasets with extremely high values of modularity or stability, CTGCN outperforms ROLAND. However, as discussed in Section 5.4, in those extreme cases adopting time-aware approaches to tackle the SS-DCD task is less appealing.

Figure 4: Gap between the F1-scores of our approach using ROLAND and CTGCN embeddings. Effects of average graph modularity (left-hand side) and average graph stability (right).
6. Conclusions and Future Work

In this paper, we formulated a new Community Detection task combining semi-supervision and dynamic graphs. We extend the existing pipeline for SS-CD by leveraging temporal graph embeddings to capture temporal dynamics in the candidate community generation, selection, and refinement stages. We also propagate the outcomes of the two policy networks used in the refinement stage across the subsequent time steps, thus making the supervision aware of the temporal relations of communities with the past, partially labeled, graph snapshots. The main takeaways from the empirical analysis are: (i) the effectiveness of the proposed pipeline on low-modularity and low-stability graphs, (ii) the importance of the community refinement stage as in the classical SS-CD task, and (iii) the little influence of the number of nodes on the computational time of the community refinement stage, which suggests optimizing the cost of graph embedding to efficiently adopt time-aware strategies in large networks.

Our future research agenda will encompass: (i) the extension to network graphs with node attributes, which might also change over time, (ii) the study of scalable approaches to SS-DCD in the steps of graph embedding and candidate community generation, but also in the community refinement stage, and (iii) the adoption of Contrastive Learning architectures to generate input embeddings (replacing Graph Neural Networks), thus limiting the complexity and the need for large sets of annotations.

Acknowledgments

This work has been partially supported by the Spoke 1 "FutureHPC & BigData" of ICSC - Centro Nazionale di Ricerca in High-Performance-Computing, Big Data and Quantum Computing, funded by European Union - NextGenerationEU.

References

[1] X. Su, S. Xue, F. Liu, J. Wu, J. Yang, C. Zhou, W. Hu, C. Paris, S. Nepal, D. Jin, Q. Z. Sheng, P. S. Yu, A comprehensive survey on community detection with deep learning, IEEE Transactions on Neural Networks and Learning Systems 35 (2024) 4682–4702. doi:10.1109/TNNLS.2021.3137396.
[2] P. Chunaev, Community detection in node-attributed social networks: a survey, Computer Science Review 37 (2020) 100286. doi:10.1016/j.cosrev.2020.100286.
[3] X. Huang, D. Chen, T. Ren, D. Wang, A survey of community detection methods in multilayer networks, Data Min. Knowl. Discov. 35 (2021) 1–45. doi:10.1007/s10618-020-00716-6.
[4] M. Rostami, M. Oussalah, K. Berahmand, V. Farrahi, Community detection algorithms in healthcare applications: A systematic review, IEEE Access 11 (2023) 30247–30272. doi:10.1109/ACCESS.2023.3260652.
[5] M. Girvan, M. E. J. Newman, Community structure in social and biological networks, Proceedings of the National Academy of Sciences 99 (2002) 7821–7826. doi:10.1073/pnas.122653799.
[6] V. D. Blondel, J.-L. Guillaume, R. Lambiotte, E. Lefebvre, Fast unfolding of communities in large networks, Journal of Statistical Mechanics: Theory and Experiment 2008 (2008) P10008. doi:10.1088/1742-5468/2008/10/P10008.
[7] N. Mehta, L. Carin, P. Rai, Stochastic blockmodels meet graph neural networks, in: K. Chaudhuri, R. Salakhutdinov (Eds.), Proceedings of the 36th International Conference on Machine Learning, ICML 2019, Long Beach, California, USA, PMLR, 2019, pp. 4466–4474.
[8] J. Yang, J. Leskovec, Overlapping communities explain core–periphery organization of networks, Proceedings of the IEEE 102 (2014) 1892–1902. doi:10.1109/JPROC.2014.2364018.
[9] X. Wu, Y. Xiong, Y. Zhang, Y. Jiao, C. Shan, Y. Sun, Y. Zhu, P. S. Yu, CLARE: A semi-supervised community detection algorithm, in: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD '22, ACM, New York, NY, USA, 2022, pp. 2059–2069. doi:10.1145/3534678.3539370.
[10] L. Haonan, L. Xiaoyu, H. Linmei, J. Li, S. Xian, Z. Linhao, W. Kaiwen, W. Hongqi, MARS: An iterative matching and rewriting model for semi-supervised community detection, Available at SSRN 4757429 (2024).
[11] K. Berahmand, Y. Li, Y. Xu, A deep semi-supervised community detection based on point-wise mutual information, IEEE Transactions on Computational Social Systems 11 (2024) 3444–3456. doi:10.1109/TCSS.2023.3327810.
[12] P. Goyal, N. Kamra, X. He, Y. Liu, DynGEM: Deep embedding method for dynamic graphs, CoRR abs/1805.11273 (2018). arXiv:1805.11273.
[13] J. You, T. Du, J. Leskovec, ROLAND: Graph learning framework for dynamic graphs, 2022. arXiv:2208.07239.
[14] Z. Wang, C. Wang, C. Gao, X. Li, X. Li, An evolutionary autoencoder for dynamic community detection, Science China Information Sciences 63 (2020) 212205. doi:10.1007/s11432-020-2827-9.
[15] L. Fan, S. Xu, D. Liu, Y. Ru, Semi-supervised community detection based on distance dynamics, IEEE Access 6 (2018) 37261–37271. doi:10.1109/ACCESS.2018.2838568.
[16] G. Paoletti, L. Gioacchini, M. Mellia, L. Vassio, J. M. Almeida, Benchmarking evolutionary community detection algorithms in dynamic networks, in: 4th Workshop on Graphs and more Complex structures for Learning and Reasoning (GCLR) at AAAI 2024, Cornell Tech, 2024, pp. 1–8. arXiv:2312.13784.
[17] J. Leskovec, A. Krevl, SNAP Datasets: Stanford large network dataset collection, http://snap.stanford.edu/data, 2014.
[18] A. Lancichinetti, S. Fortunato, F. Radicchi, Benchmark graphs for testing community detection algorithms, Physical Review E 78 (2008). doi:10.1103/PhysRevE.78.046110.
[19] P. Vanhems, A. Barrat, C. Cattuto, J.-F. Pinton, N. Khanafer, C. Régis, B.-a. Kim, B. Comte, N. Voirin, Estimating potential infection transmission routes in hospital wards using wearable proximity sensors, PLOS ONE 8 (2013) 1–9. doi:10.1371/journal.pone.0073970.
[20] R. Mastrandrea, J. Fournet, A. Barrat, Contact patterns in a high school: A comparison between data collected using wearable sensors, contact diaries and friendship surveys, PLOS ONE 10 (2015) 1–26. doi:10.1371/journal.pone.0136497.
[21] C. Largeron, P.-N. Mougel, O. Benyahia, O. R. Zaïane, DANCer: dynamic attributed networks with community structure generation, Knowledge and Information Systems 53 (2017) 109–151.
[22] X. V. Nguyen, J. Epps, J. Bailey, Information theoretic measures for clusterings comparison: is a correction for chance necessary?, in: Proceedings of the 26th Annual International Conference on Machine Learning, ICML 2009, Montreal, Quebec, Canada, ACM, 2009, pp. 1073–1080. doi:10.1145/1553374.1553511.
[23] J. Yang, J. J. McAuley, J. Leskovec, Community detection in networks with node attributes, in: 2013 IEEE 13th International Conference on Data Mining, Dallas, TX, USA, IEEE Computer Society, 2013, pp. 1151–1156. doi:10.1109/ICDM.2013.167.
[24] T. Chakraborty, A. Dalmia, A. Mukherjee, N. Ganguly, Metrics for community analysis: A survey, 2016. arXiv:1604.03512.
[25] Y. Jia, Q. Zhang, W. Zhang, X. Wang, CommunityGAN: Community detection with generative adversarial nets, 2019. arXiv:1901.06631.
[26] A. F. McDaid, D. Greene, N. Hurley, Normalized mutual information to evaluate overlapping community finding algorithms, 2013. arXiv:1110.2515.
[27] J. Liu, C. Xu, C. Yin, W. Wu, Y. Song, K-core based temporal graph convolutional network for dynamic graphs, CoRR abs/2003.09902 (2020). arXiv:2003.09902.
[28] J. Leskovec, L. Backstrom, R. Kumar, A. Tomkins, Microscopic evolution of social networks, in: KDD '08: Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, New York, NY, USA, 2008, pp. 462–470. doi:10.1145/1401890.1401948.
[29] A. A. Hagberg, D. A. Schult, P. J. Swart, Exploring network structure, dynamics, and function using NetworkX, in: Proceedings of the 7th Python in Science Conference, Pasadena, CA, USA, 2008, pp. 11–15. URL: http://conference.scipy.org/proceedings/SciPy2008/paper_2/.