Semi-supervised Community Detection in Dynamic Graphs

Matteo Bianco, Luca Cagliero* and Luca Vassio
Politecnico di Torino, Corso Duca degli Abruzzi, 24, 10129 Torino TO, Italy

Abstract
Semi-supervised approaches to Community Detection (CD) in graphs aim to detect communities closely related to a few labeled ones. State-of-the-art semi-supervised algorithms adopt a three-step process, which entails (1) Generating candidate communities based solely on the network structure; (2) Selecting the candidates that are most similar to the labeled communities; (3) Refining the candidates shortlisted at Step (2). However, existing approaches are unsuited to handle the dynamics in labeled communities and their relations with time-varying graph structures. In this work, we formulate the new task of semi-supervised CD from dynamic graphs, which is relevant to real-world time-evolving scenarios. To avoid executing the previous pipeline independently at every time step and potentially missing relevant temporal community-level relations, we envisage a new approach relying on time-aware strategies for both dynamic graph embedding and community selection and refinement. We leverage a latent graph representation incorporating node- and subgraph-level temporal relations neglected by static approaches. Then, supervised community refinements are propagated across consecutive time steps to capture time-evolving trends. After adapting static CD models to the dynamic scenario, we conduct extensive comparisons of the methods on datasets with varying characteristics in the novel task of dynamic semi-supervised CD. The proposed approach shows remarkable improvements in low-modularity and low-stability dynamic graphs.

Keywords
Community Detection, Dynamic Graphs, Semi-Supervised Learning

1. Introduction

Communities in graphs are groups of nodes with distinctive features or connections [1]. Automating the process of Community Detection (CD) from graphs is a well-established machine learning problem. It has found application in several fields, among which social network analysis [2], scientometrics [3], and healthcare [4].

To deeply explore the network graph structure, unsupervised approaches to CD have attempted to use a variety of classical data mining or Deep Learning techniques such as clustering [5], graph mining [6], and Variational AutoEncoders [7]. However, they struggle to find communities of nodes with functional relations that are not directly derivable from the network structure [8]. To tackle this issue, Semi-Supervised CD (SS-CD) approaches leverage a few examples of labeled communities, typically provided by domain experts. To efficiently address SS-CD on large graphs, state-of-the-art approaches (e.g., [9, 10, 11]) first encode graph nodes or subgraphs using graph embedding techniques. Next, they extract candidate communities based solely on the network structure. Finally, they shortlist and refine the extracted candidates by maximizing their similarity with the labeled communities in the embedding space.

Since real-world communities are naturally time-evolving, there is an increasing need to extend existing CD solutions suited to static graphs towards dynamic scenarios. Recent approaches to CD capture time-evolving trends in dynamic graphs by learning temporal graph embeddings. They either learn temporal relations in sequences of graph snapshots [12, 13, 14] or rely on parametric distance dynamic models [15]. However, to the best of our knowledge, all prior works on CD from dynamic graphs are unsuited to a semi-supervised scenario where labeled communities change over time. This calls for new approaches addressing SS-CD from dynamic graphs, in which temporal relations between graph snapshots and time-varying labeled communities are jointly analyzed.

Contribution Firstly, we formalize the Semi-Supervised Dynamic Community Detection (SS-DCD) task (Section 2). Unlike the unsupervised task, here the CD process is guided by labeled communities. Secondly, we propose an approach to tackle the newly proposed SS-DCD task (Section 3). Unlike all existing works (e.g., [9, 10]), our approach leverages temporal graph embeddings to incrementally update the dynamic graph representation. Furthermore, the steps of supervised community selection and refinement are time-aware. The aim is to consider not only the temporal relations between network graphs in the different snapshots but also the evolution of the labeled communities. Lastly, we empirically compare the performance of our approach and baseline methods under the unexplored SS-DCD setting (Section 5).

Evaluation We adapt real static graphs with ground-truth communities and various topological characteristics to a dynamic scenario (Section 4). On top of the dynamic graphs, we assess both our approach and baseline methods in terms of the established F1, Jaccard, and ONMI performance scores. The results show that our approach performs on average the best on the analyzed datasets in terms of F1, Jaccard, and ONMI scores (19 wins out of 27 combinations of datasets and metrics). The results are mainly influenced by the network modularity and the level of dynamics in the sequence of graph snapshots and corresponding labels. Specifically, when the graph has a high modularity the communities are relatively easy to detect from the current network topology regardless of its past snapshots and labeled communities. The experiments confirm that the lower the modularity the higher the benefits of the newly proposed time-aware semi-supervised strategy. Similarly, the temporal stability of the network [16] is another important indicator of the level of complexity of the SS-DCD task. Our results confirm the better suitability of the proposed approach to dynamic scenarios compared to state-of-the-art approaches.

Published in the Proceedings of the Workshops of the EDBT/ICDT 2025 Joint Conference (March 25-28, 2025), Barcelona, Spain.
* Corresponding author.
Emails: matteo.bianco@studenti.polito.it (M. Bianco); luca.vassio@polito.it (L. Cagliero); luca.vassio@polito.it (L. Vassio)
ORCID: 0000-0002-7185-5247 (L. Cagliero); 0000-0002-2920-1856 (L. Vassio)
© 2025 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
2. Problem statement

We first introduce the preliminary concepts and related notation and then formalize the newly proposed Semi-Supervised Dynamic Community Detection (SS-DCD) task.

Let G^t = (V^t, E^t) be a graph where V^t and E^t are the corresponding sets of nodes and edges, respectively. A community c^t detected from G^t is a subset of the nodes with peculiar characteristics in terms of either network graph structure or node properties. Given a graph G^t, the Unsupervised Community Detection (U-CD) task has the goal of extracting the set C^t of all its communities without any prior knowledge on the community properties, i.e., U-CD exclusively relies on the network graph structure. When, instead, the extraction process is guided by a given set Ĉ^t of labeled communities in G^t, the task is commonly denoted by Supervised Community Detection (S-CD). The key idea behind S-CD is to detect communities in graphs that are closely related to the labeled ones by learning a model that automatically captures similarities among subgraphs. Importantly, S-CD is not necessarily based on the network structure solely but might consider functional properties of nodes or edges as well [8]. In this work, we assume that the CD task is partially supervised, i.e., the input set of labeled communities is incomplete. Given a graph G^t and a training set Ĉ^t consisting of a few labeled communities in G^t, we denote by Semi-Supervised Community Detection (SS-CD) the task of automatically detecting all the communities in G^t.

We aim to model the temporal variations of a graph structure and its underlying communities within a reference time period. Without loss of generality, we assume that the reference period is divided into T discrete consecutive time steps (i.e., t ∈ {1, 2, ..., T}).

Dynamic Community Detection (DCD) aims to detect all communities from sequences of graph snapshots 𝒢 = {G^t}_{t=1}^T. Unsupervised approaches to DCD extract communities from G^t for every t ∈ {1, 2, ..., T} in the absence of labeled communities. Conversely, Supervised DCD (S-DCD) leverages the set Ĉ^t of labeled communities occurring in each time step t. However, similar to the time-invariant case, the set of labeled communities is likely incomplete due to the lack of human annotations or reliable community descriptors. To tackle the above issue, we formalize the new task of Semi-Supervised Dynamic Community Detection (SS-DCD, in short). The twofold aims are, on the one hand, to make SS-CD time-aware by leveraging CD semi-supervision and, on the other hand, to enrich Unsupervised DCD with a combined analysis of the properties of both the network structure and the labeled communities across different snapshots.

SS-DCD task formulation Given a sequence of graph snapshots 𝒢 and a partial set Ĉ^t of labeled communities occurring at every time step t ∈ {1, 2, ..., T}, the SS-DCD task aims to extract the sequence 𝒞 = {C^t}_{t=1}^T of community sets C^1, ..., C^T.
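In compact form, keeping the notation introduced above, the task can be read as the following mapping (a restatement of the formulation above, with no additional constraints implied):

\[
\text{SS-DCD}:\ \big(\mathcal{G}=\{G^t\}_{t=1}^{T},\ \{\hat{C}^t\}_{t=1}^{T}\big)\ \longmapsto\ \mathcal{C}=\{C^t\}_{t=1}^{T},
\qquad \hat{C}^t\ \text{partial (few labeled communities) for every } t .
\]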
3. Semi-supervised Community Detection from Dynamic Graphs

The classical SS-CD pipeline entails the following steps:
1. Candidate community generation, aimed to extract candidate communities based on the structure of the network graph.
2. Supervised candidate selection, which reduces the set of candidates generated at the previous step according to their similarity with the labeled communities.
3. Community refinement, aimed to revise the generated communities, e.g., by dropping or adding nodes or edges.

However, all the above-mentioned steps ignore the temporal evolution of both graph structure and communities, which is peculiar to the newly proposed SS-DCD task.

Figure 1 shows how to naively use the classical SS-CD pipeline in an SS-DCD task. For the sake of simplicity, we exemplify the CD process across two consecutive time steps only, namely t-1 and t. The temporal graph snapshots G^{t-1} and G^t are separately embedded to generate the candidate communities. Next, the candidate selection and refinement processes are executed independently for every time step. This existing pipeline has two main drawbacks:

• The process of generation of the candidate communities disregards the temporal relations among graph snapshots and related communities. Consequently, the next supervised candidate selection step could be biased.
• The community refinement step is unaware of the outcomes of the supervised community detection and refinement steps carried out at the previous time steps. Hence, it potentially disregards all past (labeled or detected) community updates as well as the temporal relations with the surrounding network.

Let us consider, for example, the labeled community ĉ^{t-1} at time step t-1 consisting of nodes v1 and v3 (ĉ^{t-1} = {v1, v3}). The path connecting v1 and v3 changes at time step t by including also node v4. Knowing both the past community composition and the new network structure allows the CD algorithm to learn the temporal relation with its updated version ĉ^t = {v1, v3, v4}. Analogously, the exclusion of node v2 from ĉ^{t-1} is a valuable hint for the definition of its updated version at time step t.

Figure 1: Separate application of the existing SS-CD pipeline to two consecutive time steps.

To overcome the aforesaid limitations, we propose a new SS-DCD pipeline (see Figure 2). Compared to the standard pipeline, we incorporate the following two additional features making the SS-CD pipeline end-to-end time-aware.

Figure 2: The proposed SS-DCD pipeline applied to two consecutive time steps.

Time-aware graph embeddings Rather than learning independent embedding matrices for every snapshot t in 𝒢, we compute a time-aware embedding representation H^t of the previous graph snapshots G^1, ..., G^t capturing time-varying properties of the network graph structure. The embedding representation is incrementally updated at each time step and, importantly, is directly fed to the supervised candidate selection step to allow time-aware semi-supervision (see the time-aware supervised candidate selection step in Figure 2). The designed SS-DCD pipeline is agnostic to the encoder used to compute the embedding matrix. For incremental learning of node embeddings, our current implementation leverages the ROLAND node embeddings [13] (more details are given in Section 5.3).
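To make the incremental update concrete, the sketch below shows one way a ROLAND-style encoder can carry node states across snapshots: a message-passing step on the current adjacency followed by a GRU cell that merges the new evidence with the previous state H^{t-1}. It is a minimal illustration written in plain PyTorch; the layer sizes, the single propagation layer, and the absence of ROLAND's hierarchical node states are simplifying assumptions, not the implementation used in the paper.

```python
import torch
import torch.nn as nn

class IncrementalNodeEncoder(nn.Module):
    """Toy time-aware node encoder: H^t = GRU(message(A^t, H^{t-1}), H^{t-1})."""
    def __init__(self, dim: int):
        super().__init__()
        self.msg = nn.Linear(dim, dim)      # transforms aggregated neighbor information
        self.update = nn.GRUCell(dim, dim)  # merges new evidence with the past node state

    def forward(self, adj_t: torch.Tensor, h_prev: torch.Tensor) -> torch.Tensor:
        # One propagation step over the snapshot at time t (row-normalized adjacency).
        deg = adj_t.sum(dim=1, keepdim=True).clamp(min=1.0)
        neigh = (adj_t / deg) @ h_prev
        m = torch.relu(self.msg(neigh))
        # Incremental update: H^t depends on all past snapshots through h_prev,
        # so the embedding matrix is never recomputed from scratch.
        return self.update(m, h_prev)

# Usage on a short sequence of snapshots (random adjacencies as placeholders).
num_nodes, dim, T = 5, 16, 3
encoder = IncrementalNodeEncoder(dim)
h = torch.zeros(num_nodes, dim)  # H^0
snapshots = [torch.bernoulli(torch.full((num_nodes, num_nodes), 0.3)) for _ in range(T)]
for adj_t in snapshots:
    h = encoder(adj_t, h)        # H^t, fed to the time-aware candidate selection step
```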
Feed-Forward propagation of community refinements To tailor community refinement to a dynamic scenario, (1) we leverage time-aware embeddings to also consider the time-variant properties of the nodes composing the selected community to be refined; and (2) we apply a feed-forward propagation of the refined communities' information learnt at time step t across the subsequent time steps.

Given a selected community c^t, we revise its node composition using a Reinforcement Learning (RL) approach. Let N_{c^t} be the subset of G^t nodes belonging to the neighborhood of c^t. We define the initial state of the RL agent as the union of the community with its neighborhood (c^t ∪ N_{c^t}). The state representation consists of the time-aware embeddings of the composing nodes, i.e., H_{c^t ∪ N_{c^t}}.

Similar to CLARE [9], we define two policy networks, consisting of separate MultiLayer Perceptrons, to decide whether to execute any of the following refinement actions on the community c^t:
• Expansion, which adds to the community a new node taken from its neighborhood;
• Reduction, which excludes nodes that are already part of the community.

Each action is rewarded according to the F1-Score improvement achieved compared to the previous iteration. The ExpMLP network returns the probabilities of adding to c^t a new node from the community neighborhood N_{c^t}, whereas ReductMLP estimates the likelihood of removing any node from c^t. To make the community refinement process time-aware, both networks are fed with the time-aware embeddings. The action probability distributions are defined as follows:

P(action = reduction of c^t) = softmax( ReductMLP( H_{c^t ∪ N_{c^t}} ) )
P(action = expansion of c^t) = softmax( ExpMLP( H_{c^t ∪ N_{c^t}} ) )

The aforesaid probability distributions are propagated to the supervised community selection stage to initialize the CD process at the next time step (see the refinements' feedforward propagation arrow in Figure 2).
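As an illustration of the two policies, the snippet below sketches ExpMLP/ReductMLP-style scorers that map the time-aware embeddings of c^t ∪ N_{c^t} to softmax distributions over candidate nodes to add or remove. The two-layer architecture, hidden size, and masking scheme are our own simplifying assumptions for exposition; they are not taken from the CLARE code nor from the implementation evaluated in Section 5.

```python
import torch
import torch.nn as nn

class NodeScorer(nn.Module):
    """Scores each node of c^t ∪ N_{c^t}; instantiated once as ExpMLP and once as ReductMLP."""
    def __init__(self, dim: int, hidden: int = 32):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, h_state: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
        # h_state: (|c^t ∪ N_{c^t}|, dim) time-aware embeddings; mask: 1 for eligible nodes.
        scores = self.mlp(h_state).squeeze(-1)
        scores = scores.masked_fill(mask == 0, float("-inf"))
        return torch.softmax(scores, dim=-1)  # action probability distribution

# Toy usage: 4 community nodes followed by 3 neighborhood nodes, 16-dim embeddings.
h_state = torch.randn(7, 16)
in_community = torch.tensor([1, 1, 1, 1, 0, 0, 0])
exp_mlp, reduct_mlp = NodeScorer(16), NodeScorer(16)
p_expand = exp_mlp(h_state, mask=1 - in_community)  # which neighbor to add (Expansion)
p_reduce = reduct_mlp(h_state, mask=in_community)   # which member to drop (Reduction)
```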
4. Dynamic graphs

In this section, we introduce the generator used to create dynamic graphs with time-evolving labeled communities and the datasets used in the experiments.

4.1. Synthetic Generator of Dynamic Graphs with Labeled Communities

CD algorithms are commonly tested on both real-world and synthetic data [1]. Real benchmarks including ground-truth communities (e.g., [17, 18]) are mostly static. Few of them consist of dynamic graphs with annotations (e.g., [19, 20]) but include a relatively low number of ground-truth communities (< 10) and nodes (on average < 100), thus hindering their use for testing Deep Learning techniques. Synthetic generators (e.g., [21, 16]) provide ground-truth communities generated by reference external models, thus making the process of semi-supervision unrealistic.

To bridge the gap, we extend a synthetic generator of dynamic graphs [21] to simulate the temporal variations of real graphs and ground-truth communities designed for static scenarios. The cornerstones of our synthetic graph generator, namely DynamizeGraph, are enumerated below.
• We consider real graphs and ground-truth communities as initial snapshots (at time step 1).
• We simulate the temporal evolution of the real graph and its ground-truth communities by hiding or showing nodes, edges, or labeled communities.
• We apply graph transformations from one snapshot to another that are related to either specific nodes/edges (micro-operations) or entire communities (macro-operations).

A Python implementation of DynamizeGraph is also available, for research purposes, in the official project repository.
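The snippet below illustrates the kind of edge-level micro-operation DynamizeGraph applies between consecutive snapshots: starting from a real graph, a fraction of visible edges is hidden and previously hidden edges are shown again. It is a minimal sketch of the idea only; the function name, the single edge-level operation, and the fixed hiding/showing probabilities are hypothetical and do not reproduce the released generator.

```python
import random
import networkx as nx

def next_snapshot(g: nx.Graph, hidden: set, p_hide: float = 0.05, p_show: float = 0.3, seed: int = 0):
    """Build snapshot t+1 from snapshot t by hiding some visible edges and showing hidden ones."""
    rng = random.Random(seed)
    g_next, hidden_next = g.copy(), set(hidden)
    for u, v in list(g.edges()):          # micro-operation: hide an existing edge
        if rng.random() < p_hide:
            g_next.remove_edge(u, v)
            hidden_next.add((u, v))
    for u, v in list(hidden):             # micro-operation: show a previously hidden edge
        if rng.random() < p_show:
            g_next.add_edge(u, v)
            hidden_next.discard((u, v))
    return g_next, hidden_next

# Usage: turn a static graph into a short sequence of snapshots (time step 1 = real graph).
g1 = nx.karate_club_graph()
snapshots, hidden = [g1], set()
for t in range(1, 4):
    g_next, hidden = next_snapshot(snapshots[-1], hidden, seed=t)
    snapshots.append(g_next)
```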
4.2. Datasets

We run our experiments on nine different datasets: six of them are generated by DynamizeGraph starting from real graphs and labeled communities, whereas the remaining ones are purely synthetic and generated by DANCer [21]. Table 1 summarizes the main dataset statistics. Given the ground-truth communities, we consider 50% of them as training labeled communities and the remaining 50% as test labeled communities (see Section 5.1).

Table 1
Main dataset statistics: average number of vertices and edges per snapshot, average number of ground-truth communities per snapshot (train + test), average modularity per snapshot, and dynamic graph stability computed according to [16].

Dataset | #Nodes | #Edges | #Commun. | Modularity | Stability
Real initial network and communities - Dynamics injection with high stability:
email-Eu-core | 802 | 10590 | 40 | 0.31 | 0.97
Amazon | 13465 | 31233 | 1141 | 0.99 | 0.98
Youtube | 28861 | 124404 | 2373 | 0.22 | 0.97
Real initial network and communities - Dynamics injection with low stability:
email-Eu-core | 550.6 | 2726.9 | 38.7 | 0.65 | 0.80
Amazon | 9580.0 | 16729.9 | 1051.6 | 0.99 | 0.81
Youtube | 16158.5 | 30836.1 | 2107.4 | 0.55 | 0.74
Purely synthetic:
Syn_Const | 5279.3 | 18894.0 | 500.0 | 0.82 | 0.99
Syn_Growth | 3489.6 | 12594.8 | 501.0 | 0.88 | 0.93
Syn_Shrink | 4079.6 | 19735.0 | 500.5 | 0.84 | 0.98

Real graphs and communities We rely on three real networks with ground-truth communities retrieved from SNAP [17], i.e., email-Eu-core, com-Amazon, and com-Youtube. The real networks are all static but show rather different characteristics. Specifically, email-Eu-core is a denser yet smaller network including roughly 20 nodes per ground-truth community, whereas com-Amazon and com-Youtube are sparser yet much larger datasets where communities consist of approximately 10 nodes each. In terms of network modularity, the real graphs also significantly differ from each other: com-Amazon has a very high modularity whereas email-Eu-core and com-Youtube show fairly low modularity values. As discussed later on, the lower the modularity the more complex the CD task in the absence of appropriate supervision.

Injection of network dynamics We extend the real static graphs and ground-truth communities under two different dynamic settings, i.e., low or high stability. In compliance with [16], we define the stability as the average difference in Adjusted Mutual Information of the communities [22] between two consecutive time steps. The higher the stability the lower the strength of the dynamics in the network communities across consecutive graph snapshots. We expect to achieve higher benefits from our time-aware approach on dynamic graphs with relatively (but not excessively) low stability. As discussed in Section 5, the results meet the expectation.

Purely synthetic dynamic graphs We generate three dynamic graphs whose snapshots have different sizes: Syn_Const shows a roughly constant number of nodes per snapshot, in Syn_Growth the number of nodes per snapshot increases over time, whereas Syn_Shrink shows the opposite trend.

5. Experimental results

All the experiments were conducted on a single NVIDIA Tesla V100 SXM2 GPU with 32 GB memory.

5.1. Performance metrics

We evaluate SS-CD performance using the following three established metrics: F1 score, Jaccard score, and Overlapping Normalized Mutual Information (ONMI, in short). In all cases, we use a set of labeled test communities Ĉ_test with no intersection with the training ones Ĉ. To cope with dynamic scenarios, all the scores are averaged over all snapshots.

The F1 and Jaccard scores are established metrics for comparing test ground-truth and predicted communities in SS-CD [23, 24, 25, 9]. The higher the scores, the more accurate the community matching (in whatever direction). To more deeply analyze the impact of graph modularity on CD performance, we also use alternative weighted versions of the F1 and Jaccard scores. Since we are particularly interested in exploring the capabilities of CD methods to leverage the information extracted from labeled communities, we weight each test community ĉ^t inversely proportionally to its modularity m(ĉ^t): the lower the modularity of a test community, the lower its predictability in the absence of supervised knowledge, and the higher its matching contribution to the overall score. Finally, we use the Overlapping Normalized Mutual Information (ONMI) [26]. It is a rescaled version of the Mutual Information between the predicted and test sets of communities at time step t.
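The snippet below sketches how such set-matching scores can be computed: each test community is matched with its best-scoring predicted community (and vice versa, to cover both directions), and the per-community scores are averaged; repeating this per snapshot and averaging gives the dynamic version. This is a generic best-match formulation under our own assumptions, not the exact scoring code used in the experiments, and the inverse-modularity weighting is only hinted at through an optional weight argument.

```python
from statistics import mean

def jaccard(a: set, b: set) -> float:
    return len(a & b) / len(a | b) if a | b else 0.0

def f1(a: set, b: set) -> float:
    inter = len(a & b)
    if inter == 0:
        return 0.0
    p, r = inter / len(b), inter / len(a)
    return 2 * p * r / (p + r)

def best_match_score(truth, pred, score=f1, weights=None):
    """Average best-match score in both directions; `weights` (e.g., inverse modularity)
    optionally re-weights the ground-truth communities."""
    if not truth or not pred:
        return 0.0
    w = weights or [1.0] * len(truth)
    t2p = sum(wi * max(score(t, p) for p in pred) for wi, t in zip(w, truth)) / sum(w)
    p2t = mean(max(score(t, p) for t in truth) for p in pred)
    return (t2p + p2t) / 2

# Usage on one snapshot; for a dynamic graph, average the result over all snapshots.
truth = [{1, 2, 3}, {4, 5}]
pred = [{1, 2}, {4, 5, 6}]
print(best_match_score(truth, pred, score=f1), best_match_score(truth, pred, score=jaccard))
```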
5.2. Baselines

As baseline methods, we consider the following four state-of-the-art DCD approaches, extended to cope with the semi-supervised setting: DynGEM [12], CTGCN-S [27], CTGCN-C [27], and ROLAND [13]. Specifically, we modify the respective architectures by integrating an additional cross-entropy loss term for supervision driven by the labeled communities. Furthermore, we also consider CLARE [9], which is the latest and best-performing SS-CD approach¹.

The key differences between the extended DCD versions and the SS-CD baseline are that (1) CLARE relies on static order embeddings, and (2) the SS-CD baseline also performs community refinement on top of the supervised candidate selection.

¹ We omit the comparisons with MARS [10] because, to the best of our knowledge, the paper is currently under review and the source code is not available yet.
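The extension of the DCD baselines can be pictured as adding a membership-prediction head on top of the temporal node embeddings and supervising it only on the labeled communities. The sketch below shows one such auxiliary term using a binary cross-entropy over node-community membership; the head, the loss weight, and the way labels are encoded are assumptions made for illustration, not the exact modification applied to each baseline.

```python
import torch
import torch.nn as nn

class MembershipHead(nn.Module):
    """Predicts, for every node, its membership in each labeled training community."""
    def __init__(self, dim: int, num_labeled_communities: int):
        super().__init__()
        self.head = nn.Linear(dim, num_labeled_communities)

    def supervision_loss(self, h_t: torch.Tensor, membership: torch.Tensor) -> torch.Tensor:
        # h_t: (num_nodes, dim) temporal embeddings; membership: (num_nodes, K) in {0, 1}.
        return nn.functional.binary_cross_entropy_with_logits(self.head(h_t), membership)

# Usage: add the supervision term to whatever loss the DCD baseline already optimizes.
h_t = torch.randn(100, 16)
membership = (torch.rand(100, 8) > 0.9).float()   # toy labels for 8 labeled communities
aux = MembershipHead(16, 8)
baseline_loss = torch.tensor(0.0)                 # placeholder for the baseline's own loss
loss = baseline_loss + 0.5 * aux.supervision_loss(h_t, membership)
```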
5.3. Experimental settings

We test our approach by varying (1) the node embedding strategy (we test Node2Vec [28], ROLAND [13], DynGEM [12], CTGCN-S and CTGCN-C [27]); (2) the policy network architecture, in terms of type (we test MLP, GRU, and moving average) and complexity (i.e., we vary the number of attention heads); (3) the dropout rate (between zero and one); and (4) the number of training epochs (up to 2000). Based on a grid search, ROLAND [13] turns out to be the best-performing dynamic graph encoder, while a single-head GRU is the best policy network. We set the dropout rate to zero, as its impact is negligible, and the number of training epochs to 2000.

For the baseline methods, we adapt the source code released by the papers' authors. All the DCD baselines are trained for 50 epochs, whereas we train CLARE for 30 epochs to perform candidate generation and for 1000 epochs to perform community rewriting. We always use 16-dimension embeddings for DCD models and 64-dimension order embeddings for CLARE.

For all methods, we set the number of expected communities, whenever requested as an input parameter, to the number of labeled communities in the training data.

To compute the statistics on the network graphs we use the NetworkX library [29].

5.4. Performance results

Table 2 reports the F1, Jaccard, and ONMI scores achieved by our approach and the baseline methods on the test communities. Among the tested baselines, CLARE performs best on average on the analyzed datasets and settings, confirming the benefits of adopting the complete SS-CD pipeline. Our approach outperforms all the tested baselines, including CLARE, on the YouTube and the synthetic dynamic graphs; it performs best on email-Eu-core in four out of six dataset-metric combinations, whereas it ranks second behind CLARE on the Amazon dataset. On average, it shows superior performance on graphs with low modularity (e.g., YouTube) and better suitability for fairly low-stability settings. The reason is that the time-aware approach provides clear benefits when graph snapshots and labeled communities are dynamic (i.e., low stability values) as long as the strength of the dynamics does not invalidate the relevance of the temporal graph relations. For instance, on the same dataset (email-Eu-core) the average F1 score of our approach soars from 0.1931 to 0.3213 moving from a high-stability setting to a low-stability one. Conversely, on datasets like Amazon where the modularity is high, CD based on the analysis of the network structure is already quite effective. Therefore, the benefits achieved by semi-supervision turn out to be limited.

Figure 3 graphically shows the correlation between average graph modularity and per-dataset F1-score gaps between our approach and CLARE. The result confirms the negative correlation between modularity and usefulness of the time-aware community refinement step. A quantitative comparison in terms of Weighted F1 and Jaccard scores on low-modularity datasets confirms our previous findings (e.g., on YouTube our average Weighted F1-Score is 1.89 vs. 0.9 of CLARE). The achieved results indicate that the time-aware approach turns out to be particularly helpful when the community-level information conveyed by the network graph alone is not enough.

Figure 3: Correlation between average graph modularity and the average F1-score gaps between our approach and the SS-CD state-of-the-art model [9]. The higher the modularity, the lower the benefits of the time-aware approach.

Table 2
Performance comparison between baselines and our method on dynamic graphs with different characteristics. For each combination of method, dataset, and performance metric we report the average score over multiple runs and the standard deviation. The best average performance per dataset and score is highlighted in boldface.

Dataset | DynGEM [12] | CTGCN-C [27] | CTGCN-S [27] | ROLAND [13] | CLARE [9] | Ours

Real initial network and communities - Dynamics injection with high stability
F1 Score:
email-Eu-core | 0.3450 ± 0.0821 | 0.3815 ± 0.0210 | 0.1706 ± 0.0167 | 0.1699 ± 0.0105 | 0.1591 ± 0.0301 | 0.1931 ± 0.0263
Amazon | 0.3282 ± 0.0245 | 0.5488 ± 0.0539 | 0.1607 ± 0.0183 | 0.1229 ± 0.0074 | 0.6757 ± 0.0131 | 0.6207 ± 0.0304
YouTube | 0.2134 ± 0.0162 | 0.1975 ± 0.0217 | 0.2486 ± 0.0289 | 0.0867 ± 0.0084 | 0.4596 ± 0.0122 | 0.4665 ± 0.0229
Jaccard Score:
email-Eu-core | 0.2309 ± 0.0676 | 0.2547 ± 0.0204 | 0.0951 ± 0.0106 | 0.0944 ± 0.0063 | 0.0936 ± 0.0199 | 0.1169 ± 0.0218
Amazon | 0.2209 ± 0.0217 | 0.4390 ± 0.0566 | 0.0928 ± 0.0118 | 0.0667 ± 0.0042 | 0.6325 ± 0.0127 | 0.5699 ± 0.0325
YouTube | 0.1349 ± 0.0125 | 0.1192 ± 0.0157 | 0.1612 ± 0.0217 | 0.0473 ± 0.0050 | 0.3980 ± 0.0113 | 0.4042 ± 0.0216
ONMI:
email-Eu-core | 0.1173 ± 0.0580 | 0.0956 ± 0.0285 | 0.0042 ± 0.0055 | 0.0000 ± 0.0000 | 0.0102 ± 0.0165 | 0.0254 ± 0.0297
Amazon | 0.0833 ± 0.0258 | 0.3440 ± 0.0730 | 0.0084 ± 0.0033 | 0.0002 ± 0.0004 | 0.6164 ± 0.0114 | 0.5487 ± 0.0352
YouTube | 0.0330 ± 0.0093 | 0.0177 ± 0.0081 | 0.0547 ± 0.0159 | 0.0009 ± 0.0010 | 0.3596 ± 0.0104 | 0.3678 ± 0.0222

Real initial network and communities - Dynamics injection with low stability
F1 Score:
email-Eu-core | 0.3032 ± 0.0555 | 0.4637 ± 0.0791 | 0.1979 ± 0.0242 | 0.2042 ± 0.0137 | 0.3117 ± 0.0520 | 0.3213 ± 0.0504
Amazon | 0.4765 ± 0.0333 | 0.4657 ± 0.0325 | 0.1873 ± 0.0320 | 0.1138 ± 0.0075 | 0.6414 ± 0.0127 | 0.6120 ± 0.0266
YouTube | 0.1426 ± 0.0125 | 0.2532 ± 0.0409 | 0.2352 ± 0.0497 | 0.1041 ± 0.0203 | 0.4867 ± 0.0244 | 0.4912 ± 0.0307
Jaccard Score:
email-Eu-core | 0.1919 ± 0.0447 | 0.3432 ± 0.0802 | 0.1124 ± 0.0166 | 0.1176 ± 0.0098 | 0.2183 ± 0.0495 | 0.2206 ± 0.0509
Amazon | 0.3799 ± 0.0399 | 0.3580 ± 0.0316 | 0.1100 ± 0.0205 | 0.0617 ± 0.0042 | 0.5935 ± 0.0147 | 0.5560 ± 0.0324
YouTube | 0.0864 ± 0.0093 | 0.1596 ± 0.0310 | 0.1514 ± 0.0349 | 0.0583 ± 0.0118 | 0.4252 ± 0.0245 | 0.4308 ± 0.0301
ONMI:
email-Eu-core | 0.0821 ± 0.0593 | 0.2188 ± 0.1012 | 0.0036 ± 0.0075 | 0.0065 ± 0.0103 | 0.1447 ± 0.0728 | 0.1200 ± 0.0792
Amazon | 0.3082 ± 0.0592 | 0.2640 ± 0.0386 | 0.0088 ± 0.0045 | 0.0008 ± 0.0010 | 0.5862 ± 0.0169 | 0.5495 ± 0.0307
YouTube | 0.0147 ± 0.0066 | 0.0382 ± 0.0198 | 0.0478 ± 0.0199 | 0.0022 ± 0.0013 | 0.3944 ± 0.0312 | 0.4013 ± 0.0367

Purely synthetic
F1 Score:
Syn_Const | 0.3056 ± 0.0132 | 0.4778 ± 0.0699 | 0.1682 ± 0.0084 | 0.1360 ± 0.0030 | 0.5947 ± 0.0295 | 0.6071 ± 0.0228
Syn_Growth | 0.4074 ± 0.0308 | 0.5908 ± 0.0869 | 0.2518 ± 0.0182 | 0.1856 ± 0.0126 | 0.7048 ± 0.0227 | 0.7134 ± 0.0183
Syn_Shrink | 0.4199 ± 0.0498 | 0.6443 ± 0.0989 | 0.2227 ± 0.0245 | 0.1730 ± 0.0144 | 0.6860 ± 0.0443 | 0.6886 ± 0.0355
Jaccard Score:
Syn_Const | 0.1899 ± 0.0104 | 0.3480 ± 0.0641 | 0.0936 ± 0.0053 | 0.0735 ± 0.0018 | 0.4886 ± 0.0314 | 0.5064 ± 0.0271
Syn_Growth | 0.2768 ± 0.0294 | 0.4699 ± 0.0920 | 0.1489 ± 0.0130 | 0.1037 ± 0.0078 | 0.6264 ± 0.0279 | 0.6420 ± 0.0228
Syn_Shrink | 0.2905 ± 0.0471 | 0.5382 ± 0.1138 | 0.1306 ± 0.0166 | 0.0963 ± 0.0090 | 0.5973 ± 0.0547 | 0.6032 ± 0.0425
ONMI:
Syn_Const | 0.0258 ± 0.0094 | 0.2114 ± 0.0740 | 0.0008 ± 0.0013 | 0.0000 ± 0.0000 | 0.4708 ± 0.0437 | 0.4944 ± 0.0362
Syn_Growth | 0.1150 ± 0.0402 | 0.3763 ± 0.1277 | 0.0100 ± 0.0048 | 0.0000 ± 0.0000 | 0.6337 ± 0.0336 | 0.6493 ± 0.0247
Syn_Shrink | 0.1368 ± 0.0615 | 0.4555 ± 0.1473 | 0.0090 ± 0.0053 | 0.0007 ± 0.0011 | 0.5965 ± 0.0614 | 0.6040 ± 0.0453

5.5. Ablation study

We carry out an ablation study to quantify the effect of the choice of the node embedding strategy on dynamic graphs characterized by variable modularity and stability levels. To this end, we compare two alternative state-of-the-art strategies for temporal graph embeddings, i.e., ROLAND [13] and CTGCN [27].

The plots in Figure 4 show the F1-score gaps observed in our approach between the variants with ROLAND and CTGCN graph embeddings. They respectively show the separate influence of graph modularity (plot on the left) and stability (right) on the results of the ablation study. Thanks to the use of hierarchical node states, ROLAND embeddings turn out to be more effective than CTGCN in capturing dynamic graph and community relations. They are also, on average, more robust on graphs with low/medium modularity. On datasets with extremely high values of modularity or stability, CTGCN outperforms ROLAND. However, as discussed in Section 5.4, in those extreme cases adopting time-aware approaches to tackle the SS-DCD task is less appealing.

Figure 4: Gap between the F1-scores of our approach using ROLAND and CTGCN embeddings. Effects of average graph modularity (left-hand side) and average graph stability (right).
6. Conclusions and Future Work

In this paper, we formulated a new Community Detection task combining semi-supervision and dynamic graphs. We extend the existing pipeline for SS-CD by leveraging temporal graph embeddings to capture temporal dynamics in the candidate community generation, selection, and refinement stages. We also propagate the outcomes of the two policy networks used in the refinement stage across the subsequent time steps, thus making the supervision aware of the temporal relations of communities with the past, partially labeled, graph snapshots. The main takeaways from the empirical analysis are: (i) the effectiveness of the proposed pipeline on low-modularity and low-stability graphs, (ii) the importance of the community refinement stage as in the classical SS-CD task, and (iii) the little influence of the number of nodes on the computational time of the community refinement stage, which suggests optimizing the cost of graph embedding to efficiently adopt time-aware strategies in large networks.

Our future research agenda will encompass: (i) the extension to network graphs with node attributes, which might also change over time, (ii) the study of scalable approaches to SS-DCD in the steps of graph embedding and candidate community generation, but also in the community refinement stage, and (iii) the adoption of Contrastive Learning architectures to generate input embeddings (replacing Graph Neural Networks), thus limiting the complexity and the need for large sets of annotations.

Acknowledgments

This work has been partially supported by the Spoke 1 "FutureHPC & BigData" of ICSC - Centro Nazionale di Ricerca in High-Performance-Computing, Big Data and Quantum Computing, funded by European Union - NextGenerationEU.

References

[1] X. Su, S. Xue, F. Liu, J. Wu, J. Yang, C. Zhou, W. Hu, C. Paris, S. Nepal, D. Jin, Q. Z. Sheng, P. S. Yu, A comprehensive survey on community detection with deep learning, IEEE Transactions on Neural Networks and Learning Systems 35 (2024) 4682–4702. doi:10.1109/TNNLS.2021.3137396.
[2] P. Chunaev, Community detection in node-attributed social networks: a survey, Computer Science Review 37 (2020) 100286. doi:10.1016/j.cosrev.2020.100286.
[3] X. Huang, D. Chen, T. Ren, D. Wang, A survey of community detection methods in multilayer networks, Data Min. Knowl. Discov. 35 (2021) 1–45. doi:10.1007/s10618-020-00716-6.
[4] M. Rostami, M. Oussalah, K. Berahmand, V. Farrahi, Community detection algorithms in healthcare applications: A systematic review, IEEE Access 11 (2023) 30247–30272. doi:10.1109/ACCESS.2023.3260652.
[5] M. Girvan, M. E. J. Newman, Community structure in social and biological networks, Proceedings of the National Academy of Sciences 99 (2002) 7821–7826. doi:10.1073/pnas.122653799.
[6] V. D. Blondel, J.-L. Guillaume, R. Lambiotte, E. Lefebvre, Fast unfolding of communities in large networks, Journal of Statistical Mechanics: Theory and Experiment 2008 (2008) P10008. doi:10.1088/1742-5468/2008/10/P10008.
[7] N. Mehta, L. Carin, P. Rai, Stochastic blockmodels meet graph neural networks, in: K. Chaudhuri, R. Salakhutdinov (Eds.), Proceedings of the 36th International Conference on Machine Learning, ICML 2019, Long Beach, California, USA, PMLR, 2019, pp. 4466–4474.
[8] J. Yang, J. Leskovec, Overlapping communities explain core–periphery organization of networks, Proceedings of the IEEE 102 (2014) 1892–1902. doi:10.1109/JPROC.2014.2364018.
[9] X. Wu, Y. Xiong, Y. Zhang, Y. Jiao, C. Shan, Y. Sun, Y. Zhu, P. S. Yu, CLARE: A semi-supervised community detection algorithm, in: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD '22, ACM, New York, NY, USA, 2022, pp. 2059–2069. doi:10.1145/3534678.3539370.
[10] L. Haonan, L. Xiaoyu, H. Linmei, J. Li, S. Xian, Z. Linhao, W. Kaiwen, W. Hongqi, MARS: An iterative matching and rewriting model for semi-supervised community detection, Available at SSRN 4757429 (2024).
[11] K. Berahmand, Y. Li, Y. Xu, A deep semi-supervised community detection based on point-wise mutual information, IEEE Transactions on Computational Social Systems 11 (2024) 3444–3456. doi:10.1109/TCSS.2023.3327810.
[12] P. Goyal, N. Kamra, X. He, Y. Liu, DynGEM: Deep embedding method for dynamic graphs, CoRR abs/1805.11273 (2018). arXiv:1805.11273.
[13] J. You, T. Du, J. Leskovec, ROLAND: Graph learning framework for dynamic graphs, 2022. arXiv:2208.07239.
[14] Z. Wang, C. Wang, C. Gao, X. Li, X. Li, An evolutionary autoencoder for dynamic community detection, Science China Information Sciences 63 (2020) 212205. doi:10.1007/s11432-020-2827-9.
[15] L. Fan, S. Xu, D. Liu, Y. Ru, Semi-supervised community detection based on distance dynamics, IEEE Access 6 (2018) 37261–37271. doi:10.1109/ACCESS.2018.2838568.
[16] G. Paoletti, L. Gioacchini, M. Mellia, L. Vassio, J. M. Almeida, Benchmarking evolutionary community detection algorithms in dynamic networks, in: 4th Workshop on Graphs and more Complex structures for Learning and Reasoning (GCLR) at AAAI 2024, Cornell Tech, 2024, pp. 1–8. arXiv:2312.13784.
[17] J. Leskovec, A. Krevl, SNAP Datasets: Stanford large network dataset collection, http://snap.stanford.edu/data, 2014.
[18] A. Lancichinetti, S. Fortunato, F. Radicchi, Benchmark graphs for testing community detection algorithms, Physical Review E 78 (2008). doi:10.1103/PhysRevE.78.046110.
[19] P. Vanhems, A. Barrat, C. Cattuto, J.-F. Pinton, N. Khanafer, C. Régis, B.-a. Kim, B. Comte, N. Voirin, Estimating potential infection transmission routes in hospital wards using wearable proximity sensors, PLOS ONE 8 (2013) 1–9. doi:10.1371/journal.pone.0073970.
[20] R. Mastrandrea, J. Fournet, A. Barrat, Contact patterns in a high school: A comparison between data collected using wearable sensors, contact diaries and friendship surveys, PLOS ONE 10 (2015) 1–26. doi:10.1371/journal.pone.0136497.
[21] C. Largeron, P.-N. Mougel, O. Benyahia, O. R. Zaïane, DANCer: dynamic attributed networks with community structure generation, Knowledge and Information Systems 53 (2017) 109–151.
[22] X. V. Nguyen, J. Epps, J. Bailey, Information theoretic measures for clusterings comparison: is a correction for chance necessary?, in: Proceedings of the 26th Annual International Conference on Machine Learning, ICML 2009, Montreal, Quebec, Canada, ACM, 2009, pp. 1073–1080. doi:10.1145/1553374.1553511.
[23] J. Yang, J. J. McAuley, J. Leskovec, Community detection in networks with node attributes, in: 2013 IEEE 13th International Conference on Data Mining, Dallas, TX, USA, IEEE Computer Society, 2013, pp. 1151–1156. doi:10.1109/ICDM.2013.167.
[24] T. Chakraborty, A. Dalmia, A. Mukherjee, N. Ganguly, Metrics for community analysis: A survey, 2016. arXiv:1604.03512.
[25] Y. Jia, Q. Zhang, W. Zhang, X. Wang, CommunityGAN: Community detection with generative adversarial nets, 2019. arXiv:1901.06631.
[26] A. F. McDaid, D. Greene, N. Hurley, Normalized mutual information to evaluate overlapping community finding algorithms, 2013. arXiv:1110.2515.
[27] J. Liu, C. Xu, C. Yin, W. Wu, Y. Song, K-core based temporal graph convolutional network for dynamic graphs, CoRR abs/2003.09902 (2020). arXiv:2003.09902.
[28] J. Leskovec, L. Backstrom, R. Kumar, A. Tomkins, Microscopic evolution of social networks, in: KDD '08: Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, New York, NY, USA, 2008, pp. 462–470. doi:10.1145/1401890.1401948.
[29] A. A. Hagberg, D. A. Schult, P. J. Swart, Exploring network structure, dynamics, and function using NetworkX, in: Proceedings of the 7th Python in Science Conference, Pasadena, CA, USA, 2008, pp. 11–15. URL: http://conference.scipy.org/proceedings/SciPy2008/paper_2/.