=Paper=
{{Paper
|id=None
|storemode=property
|title=A Versatile Algorithm for Predictive Graph Rule Mining
|pdfUrl=https://ceur-ws.org/Vol-1422/51.pdf
|volume=Vol-1422
|dblpUrl=https://dblp.org/rec/conf/itat/Vaculik15
}}
==A Versatile Algorithm for Predictive Graph Rule Mining==
<pdf width="1500px">https://ceur-ws.org/Vol-1422/51.pdf</pdf>
<pre>
J. Yaghob (Ed.): ITAT 2015 pp. 51–58
Charles University in Prague, Prague, 2015


                    A Versatile Algorithm for Predictive Graph Rule Mining

                                                        Karel Vaculík

                                             KD Lab, FI MU Brno, Czech Republic
                                                  xvaculi4@fi.muni.cz

Abstract: Pattern mining in dynamic graphs has received            The paper is organized as follows. Section 2 gives the
a lot of attention in recent years. However, proposed meth-     necessary definitions and presents the representation of
ods are typically limited to specific classes of patterns ex-   dynamic graphs. Section 3 then provides a short descrip-
pressing only a specific types of changes. In this paper,       tion of the gSpan algorithm [13], whose framework is em-
we propose a new algorithm, DGRMiner, which is able to          ployed in our new algorithm. In Section 4, we describe the
mine patterns in the form of graph rules capturing vari-        new algorithm for graph rule mining. Experimental eval-
ous types of changes, i.e. addition and deletion of vertices    uation is presented in Section 5. Finally, related work and
and edges, and relabeling of vertices and edges. This al-       conclusion can be found in Section 6 and 7, respectively.
gorithm works both with directed and undirected dynamic
graphs with multiedges. It is designed both for the single-
dynamic-graph and the set-of-dynamic-graphs scenarios.          2       Predictive Graph Rules
The performance of the algorithm has been evaluated by
using two real-world and two synthetic datasets.                In this section, we provide definitions of a dynamic graph
                                                                and predictive rules. Definitions of significance measures
                                                                support and confidence are also part of this section.
1 Introduction
                                                                2.1 Dynamic Graphs
Data mining of complex structures has been extensively
studied for quite a while. Recently, more attention has         Before defining a dynamic graph, we need to consider the
been drawn to the area of dynamic graphs, i.e. graphs           notion of a static graph. In this paper, by a static graph we
evolving through time. A lot of research has been car-          will denote a directed labeled multigraph without loops
ried out both from the global and local perspectives of dy-     and with a restriction that no two edges with the same
namic graphs. For instance, global characteristics such as      source and target vertices can have the same label. The
density and diameter were studied in real dynamic graphs        proposed mining algorithm, however, can work with undi-
in [6]. On the other hand, graph mining on the local level is   rected edges too. For the sake of simplicity, we will restrict
most frequently focused on pattern mining. Such patterns        ourselves only to directed graphs in this section.
are represented by subgraphs and their evolution through
a short period of time [1, 2, 8].                               Definition 1 (Static graph). A static graph is a 5-tuple G =
   Nevertheless, most of the pattern mining methods for         (VG , EG , fG , lG,V , lG,E ), where VG is a set of vertices, EG is
dynamic graphs impose various restrictions, such as the         a set of edges, fG : EG → VG × VG is a map assigning a
type of the dynamic graphs or the type of changes cap-          pair of vertices (u, v), u 6= v, to every edge, lG,V and lG,E
tured by such patterns. For example, GERM algorithm [1]         are two maps describing labeling of the vertices and edges,
assumes that vertices and edges are only added and never        respectively. Furthermore, ∀e1 , e2 ∈ EG ( f (e1 ) = f (e2 ) ⇒
deleted in the input dynamic graph. Furthermore, it mines       lE (e1 ) 6= lE (e2 )).
patterns representing only edge additions.                         A dynamic graph is then given by a finite sequence of
   In this paper, we propose a new algorithm DGRMiner           static graphs in which no two adjacent graphs are identi-
for mining frequent patterns that can capture various           cal as we want to capture only the changes in the dynamic
changes in dynamic graphs. Specifically, the patterns are       graph. Moreover, we extend each static graph G by times-
in the form of predictive rules expressing how a subgraph       tamp functions tG,V : VG → T and tG,E : EG → T , which
can be changed into another subgraph by adding new ver-         map each vertex and edge to a point in time. We will work
tices and edges, deleting specific vertices and edges, or       with discretized time and thus T will be the set of integers,
relabeling vertices and edges. An example of a dynamic          i.e. T = Z.
graph and two predictive rules are illustrated in Fig. 1.
   The algorithm is able to mine patterns from a single dy-     Definition 2 (Dynamic graph). A dynamic graph is a finite
namic graph and also from a set of dynamic graphs. Such         sequence DG = (G1 , G2 , ..., Gn ), where Gi is a static graph
graph rules are useful for prediction in dynamic graphs,        extended by timestamp functions tGi ,V , tGi ,E for all 1 ≤ i ≤
they can be used as pattern features representing dynamic       n, and G j 6= G j+1 for all 1 ≤ j ≤ n-1. Graph Gi is referred
graphs or simply for gaining an insight into internal pro-      to as the snapshot of DG at time i. In addition, for each
cesses of the graphs.                                           1 ≤ i ≤ n, the timestamp functions tGi ,V , tGi ,E assign to
52                                                                                                                            K. Vaculík


Figure 1: An example of a dynamic graph and two predictive graph rules. Numbers after slash symbols represent times-
tamps and dotted edges represent deleted edges.


each vertex and edge the time from which they have their               i. VGA = 0/ ∧VGC 6= 0/ ∧ ∀v ∈ VGC (tGC ,V (v) = 0) ∧
current label, i.e. tGi ,V (v) = min( j|1 ≤ j ≤ i ∧ ∀k, j ≤ k ≤           ∀e ∈ EGC (tGC ,E (e) = 0)
i, v ∈ VGk ∧ lGk ,V (v) = lGi ,V (v)) and similarly for tGi ,E (e).
                                                                       ii. VGA 6= 0/ ∧VGC = 0/
   As we want to capture patterns with rich information,
                                                                      iii. (VGA ∩VGC 6= 0)∧  /
we will also assume that the dynamic graph keeps track
                                                                           (∃v ∈ VGC (tGC ,V (v) = 0) ∨ ∃e ∈ EGC (tGC ,E (e) = 0))∧
of the deleted vertices and edges, but only until they
                                                                           (∀v ∈ VGC rVGA (tGC ,V (v) = 0))∧
are added back into the graph. For example, consider
                                                                           (∀e ∈ EGC r EGA (tGC ,E (e) = 0))∧
the dynamic graph in Fig. 1 with five snapshots. Then
                                                                           (∀v ∈ VGA ∩VGC ((tGC ,V (v) = tGA ,V (v) − 1 ∧ lGC ,V (v) =
tG1 ,V (v1 ) = 1, tG4 ,V (v5 ) = 3, etc. Also notice the edge be-
                                                                           lGA ,V (v)) ∨ (0 = tGC ,V (v) ≥ tGA ,V (v) ∧ lGC ,V (v) 6=
tween vertices v3 and v1 in snapshot G3 . It is deleted in
                                                                           lGA ,V (v)))∧
snapshot G4 but we keep information about this deleted
                                                                           (∀e ∈ EGA ∩ EGC ( fGC (e) = fGA (e) ∧ ((tGC ,E (e) =
edge in this snapshot. This information is discarded in
                                                                           tGA ,E (e) − 1 ∧ lGC ,E (e) = lGA ,E (e)) ∨ (0 = tGC ,E (e) ≥
snapshot G5 because a new edge with exactly the same
                                                                           tGA ,E (e) ∧ lGC ,E (e) 6= lGA ,E (e))))
information is added there.
                                                                        Then we say that GA ⇒ GC is a predictive graph rule,
                                                                      where GA is called antecedent and GC consequent.
2.2     Predictive Rules
                                                                         The first two conditions in the above definition cover
The aim of the mining algorithm is to find predictive graph           situations in which the rules express either addition of an
rules, i.e. rules expressing how a subgraph of a snapshot             isolated graph into a dynamic graph or a deletion of a sub-
will most likely change in future. As we want to incor-               graph from a dynamic graph. The third condition cov-
porate the time information into the rules and at the same            ers situations in which one subgraph is transformed into
time we are interested in mining general patterns which               another subgraph. Here, we require VGA ∩ VGC 6= 0/ be-
are not tied to absolute time, we need to use relative times-         cause we are not interested in rules consisting of unrelated
tamps for rules. A relative timestamp equal to 0 will de-             graphs. In addition, we require the rule to contain at least
note a current change and timestamp equal to −t will de-              one change related to a vertex or an edge. Vertices and
note a change that happened t snapshots earlier. Now we               edges occurring only in consequent must have timestamp
can define predictive graph rules as follows.                         equal to 0 as they represent an addition. For vertices and
Definition 3 (Predictive Graph Rule). Let GA , GC be two              edges common for both graphs we require that they either
static graphs with timestamp functions tGA ,V , tGA ,E , tGC ,V ,     were not changed a thus their relative timestamps differ by
tGC ,E restricted to range (−∞, 0] such that union graph1 of          one, or they were changed and thus their timestamp cannot
GA and GC is a connected graph and exactly one of the                 be lower in the consequent. Moreover, we cannot change
following conditions holds:                                           edges by reorienting them, i.e. we would have to delete
                                                                      the original edge and add a new one with the opposite ori-
      1 A graph created from union of vertices and edges.             entation. Lastly, as we keep track of the deleted edges and
A Versatile Algorithm for Predictive Graph Rule Mining                                                                                 53


vertices in the dynamic graph, the predictive rules can also     3    gSpan Revisited
contain these deleted vertices and edges. It does not pose
any restriction to the patterns, contrariwise it can only help   The novel algorithm for mining predictive graph rules em-
us capture more information in the patterns in case such         ploys the framework of the gSpan algorithm [13]. We
information is present in the dynamic graph.                     modified and further extended this framework for the pur-
   In Fig. 1 we can see two examples of graph prediction         pose of mining graph rules from dynamic graphs. First,
rules. Both rules depict changes in connection and also          we revise the main ideas of gSpan and then we provide the
label changes. There is also a vertex with label A in both       details of the new algorithm.
rules which is not changed and thus its timestamp is de-             gSpan [13] is an algorithm for mining frequent patterns
creased by one in the consequent.                                (subgraphs) from a set of simple undirected static graphs.
   In order to select only interesting rules, various mea-       Simple means that it does not handle multiedges. The out-
sures of significance are typically used. Here, we use sup-      put frequent subgraphs are connected.
port and confidence. As we use relative timestamps for               gSpan starts from single-edge patterns and extends
graphs in rules, we also need to provide the notion of an        these patterns edge by edge to create larger patterns.
occurrence of such a graph in the given dynamic graph.           Each such pattern can be encoded by a DFS (Depth-First
Definition 4 (Occurrence of a rule graph). Let G be a            Search) code. A DFS code of a pattern represents a spe-
graph used in a rule, say in antecedent without loss of          cific DFS traversal of the pattern and it is represented by
generality, with timestamp functions tG,V , tG,E , and let       a list of 5-tuples (i, j, li , l(i, j) , l j ). Such a 5-tuple represents
DG = (G1 , ..., Gi , ..., Gn ) be a dynamic graph. We say that   an edge between the i-th and j-th discovered vertices by
G occurs in snapshot Gi , written as G v Gi , if there exists    the DFS traversal, li and l j are labels of those vertices, and
a function ϕ : VG → VGi such that:                               l(i, j) is the label of the edge. Thus, the first 5-tuple has al-
                                                                 ways i = 0 and j = 1, and it holds for other 5-tuples that
  i. ∀u ∈ VG (ϕ(u) ∈ VGi ∧ lG,V (u) = lGi ,V (ϕ(u)) ∧            i < j if it is a forward edge in the DFS traversal and i > j
     tG,V (u) = tGi ,V (ϕ(u)) − i),                              if it is a backward edge.
 ii. ∀e ∈ EG ( fG (e) = (u, v) ⇒ ∃!e0 ∈ EGi ( fGi (e0 ) =            As there are more ways the DFS traversal can be per-
     (ϕ(u), ϕ(v)) ∧ lG,E (e) = lGi ,E (e0 ) ∧ tG,E (e) =         formed on a single pattern, there are also more DFS codes
     tGi ,E (e0 ) − i)                                           for each pattern. A lexicographic order is defined on DFS
                                                                 codes and the minimum one is maintained for each pat-
Definition 5 (Support and Confidence). Let GA ⇒ GC be            tern. This lexicographic order is also applied on codes of
a rule and DG = (G1 , ..., Gi , ..., Gn ) a dynamic graph. We    different patterns to represent the search space as a tree,
define support of GA ⇒ GC , support of GA , and confidence       called DFS Code Tree. In this DFS Code Tree, each vertex
of GA ⇒ GC as follows:                                           represents one DFS code and children of a vertex can be
    σDG (GA ⇒ GC ) = |{i|GA v Gi , GC v Gi+1 ,                   obtained by all possible single-edge extensions of the ver-
                                                                 tex. Therefore all codes on the same level of the tree have
                             1 ≤ i ≤ n − 1}|/(n − 1)
                                                                 the same number of edges. Moreover, children of a vertex
            σDG (GA ) = |{i|GA v Gi , 1 ≤ i ≤ n − 1}|/(n − 1)    are ordered according to the lexicographic order. gSpan
con fDG (GA ⇒ GC ) = σDG (GA ⇒ GC )/σDG (GA )                    generates larger patterns in such a way that it corresponds
   For a set of dynamic graphs DGS =                             to a depth-first search traversal of this DFS Code Tree, i.e.
{DG1 , DG2 , ..., DGm } we extend these definitions as           it generates patterns according to the lexicographic order.
follows:                                                         gSpan does not have to extend each pattern in all possi-
                                                                 ble ways, it is enough to grow edges only from vertices on
     σDGS (GA ⇒ GC ) = |{i|σDGi (GA ⇒ GC ) > 0,                  the rightmost path2 . Specifically, it grows either a back-
                               1 ≤ i ≤ m}|/m                     ward edge from the rightmost vertex3 to another vertex
             σDGS (GA ) = |{i|σDGi (GA ) > 0, 1 ≤ i ≤ m}|/m      on the rightmost path or a forward edge from a vertex on
                                                                 the rightmost path to a newly introduced vertex. When
 con fDGS (GA ⇒ GC ) = σDGS (GA ⇒ GC )/σDGS (GA )                traversing the search space, gSpan checks whether the pat-
   Thus, support of a rule for a single dynamic graph ex-        tern of the considered DFS code is frequent. If not, it
presses the fraction of snapshots that were changed by the       prunes the search space tree on this vertex and backtracks.
rule. For a set of dynamic graphs, we count the fraction         This is possible because of the anti-monotonicity of the
of dynamic graphs that had at least one snapshot changed         support measure. It also checks whether the considered
by the rule. Confidence expresses the frequency of such a        code is the minimum one for the corresponding pattern.
change if we observe an occurrence of the antecedent. For        If it is not minimum, the search space tree is pruned on
example, both rules in Fig. 1 have support equal to 0.25         this vertex because all patterns in this pruned subtree were
and confidence equal to 1.                                       already found earlier.
   Given a minimum support value σmin and a minimum                  2 The rightmost path is given by the DFS code and it is the path from
confidence value con fmin , the task is to find all predictive   the root to the lastly discovered vertex by the DFS traversal.
graph rules for which σ ≥ σmin and con f ≥ con fmin .                 3 The last vertex on the rightmost path from root.
54                                                                                                                             K. Vaculík


Algorithm 1 gSpan(D,S)                                                     graph as a set of static graphs in such a way that a modified
 1: sort the labels in D by their frequency;                               static subgraph mining algorithm can be employed.
 2: remove infrequent vertices and edges;                                     Let us start with the transformation of rules. In order
 3: relabel the remaining vertices and edges;                              to create a single graph from a rule, we take the union of
 4: S1 ← all frequent 1-edge graphs in D;                                  the vertices and edges from its antecedent and consequent.
 5: Sort S1 in DFS lexicographic order;                                    Edges and vertices that do not represent any change in the
 6: S ← S1 ;                                                               rule will keep their labels and timestamps from the conse-
 7: for each edge e ∈ S1 do                                                quent. Let us remind that rules have relative timestamps
 8:     initialize s with e, set s.D by graphs containing e;               less than or equal to 0 and thus these edges and vertices
 9:     Subgraph_Mining(D,S,s);                                            will have timestamps less than 0. Edges and vertices rep-
10:     D ← D − e;                                                         resenting addition will keep their consequent timestamp,
11:     if |D| < σmin then                                                 which is 0, but their labels will contain flag representing
12:          break;                                                        addition, for example label A will be changed to +A. How-
                                                                           ever, timestamps of vertices and edges that were deleted
Algorithm 2 Subgraph_Mining(D,S,s)                                         or relabeled will have timestamps that are opposites of the
                                                                           antecedent timestamps. We know that consequent times-
 1: if s 6= min(s) then
                                                                           tamps of such changes are equal to 0 so we can easily get
 2:       return;
                                                                           the original value of the antecedent. We take the oppo-
 3: S ← S ∪ {s};
                                                                           site values because later it will help us recognize current
 4: enumerate s in each graph in D and count its children;
                                                                           changes simply by timestamps greater than or equal to 0.
 5: for each c, c is s’ child do
                                                                           Vertices and edges that were deleted or relabeled will also
 6:       if σ (c) ≥ σmin then                                             have new labels that can be easily decoded, for example
 7:           s ← c;                                                       −A for deletion of an object with label A and A => B for
 8:           Subgraph_Mining(Ds ,S,s);                                    relabeling from A to B. Transformed rules from Fig. 1 are
                                                                           shown in Fig. 2.
                                                                              Transformation of the input dynamic graph is very sim-
   The pseudocode of gSpan is given in Algorithm 1.                        ilar. Suppose that we have n snapshots in the dynamic
By removing the infrequent vertices and edges, the input                   graph. As the first snapshot does not represent any changes
graphs can be significantly reduced and the overall effi-                  by itself, we create n − 1 new graphs in the following
ciency increased. Frequent vertices are appended to re-                    way. When creating the k-th graph, consider union of ver-
sults as the smallest frequent patterns. The main part of                  tices and edges from snapshots 1 to k as an antecedent,
the algorithm starts from single-edge patterns. Specifi-                   where vertices and edges have their last assigned labels
cally, Subgraph_Mining 2 procedure is recursively called                   and timestamps of last changes relative to k. We can as-
on each such pattern. This procedure first tests whether                   sume that all vertices and edges from the first snapshot had
the code s is minimum. If it is minimum, it enumerates its                 timestamps equal to 1. Similarly, use snapshots 1 to k + 1
children by taking single-edge extensions. The procedure                   to create a consequent. Then we use the method for rule
is then called on the frequent children.                                   transformation to create the k-th graph. All n − 1 graphs
                                                                           can be computed in a single pass as we can update the i-th
4 DGRMiner Algorithm                                                       graph to get the (i + 1)-th one.
                                                                              Union of all graphs from the beginning may contain ver-
In this section we describe the new algorithm called DGR-                  tices and edges with very old changes that are not useful
Miner. It is based on the framework of gSpan, however,                     for the predictive rules. We use a window parameter to re-
the framework is modified and extended. First, we provide                  move such vertices and edges from the union graphs. As
necessary details about main modifications used in DGR-                    edges cannot exist without their adjacent vertices, we do
Miner and then we present the pseudocode of the whole                      not remove old vertices adjacent to edges that are not old.
algorithm with description of remaining building blocks.                   Union graphs of the dynamic graph from Fig. 1 are shown
                                                                           in Fig. 2. In this case, window size is not set.

4.1   Static Representation of the Dynamic Graphs by
      Union Graphs                                                         4.2 Modified DFS Code
The first step is a transformation of an input dynamic
                                                                           Now that we have the dynamic graph represented by union
graph4 into a data structure that can be considered as a
                                                                           graphs, which can be viewed as a set of static graphs, we
set of static graphs. The idea is that we are able to repre-
                                                                           made a large step towards mining the graph rules. There
sent the graph rules by single graphs and the input dynamic
                                                                           are, however, still several issues to be addressed.
    4 Here, we assume that there is only one dynamic graph on the input.      Let us start with a richer representation of edges. In
Extension to a set of dynamic graphs is described in Subsection 4.4.       Section 3, we showed that gSpan uses 5-tuples of the
A Versatile Algorithm for Predictive Graph Rule Mining                                                                                           55


                                                                                         The first method helps us to ignore timestamps of ver-
                                                                                      tices. Specifically, we apply the signum function to all
                                                                                      relative timestamps of vertices. Thus, negative times-
                                                                                      tamps become equal to −1 and positive timestamps be-
                                                                                      come equal to 1. This method is useful for dynamic graphs
                                                                                      in which all or almost all changes are caused by edges and
                                                                                      vertices remain more or less intact.
                                                                                         The second method also uses the signum function but
                                                                                      now it converts timestamps of both vertices and edges. It is
                                                                                      useful in situations where patterns in dynamic graphs are
                                                                                      very diverse and it is not possible to find many frequent
                                                                                      patterns with exact timestamps.

                                                                                      4.4 The Complete DGRMiner
                                                                                      In this section we provide remaining details and a de-
                                                                                      scription of the whole DGRMiner algorithm for predictive
                                                                                      graph rule mining. The pseudocode of DGRMiner is given
                                                                                      in Algorithm 3.
                                                                                         First, DGRMiner converts the input dynamic graph into
                                                                                      a set of union graphs as described in Section 4.1. In the
                                                                                      case of a set of dynamic graphs, the algorithm simply com-
                                                                                      putes union graphs for each one of them and then con-
                                                                                      catenates the results. It only needs to keep the mapping
                                                                                      of those union graphs into the original dynamic graphs in
                                                                                      order to be able to compute their support and confidence
Figure 2: The union graph representation of the dynamic                               correctly. Optional application of an abstraction method,
graph and the rules from Fig. 1.                                                      described in Section 4.3, follows next. Then the algorithm
                                                                                      removes infrequent vertices and edges but only those that
form (i, j, li , l(i, j) , l j ) to represent edges of patterns. In or-               represent changes as the other ones may be used later for
der to incorporate relative timestamps of rules and orien-                            confidence computation. When computing frequencies, it
tation of the edges, we simply extend these 5-tuples to                               takes labels, timestamps, and edge orientations into ac-
9-tuples of the form (i, j, li ,ti , d(i, j) , l(i, j) ,t(i, j) , l j ,t j ). It is   count. Before moving to single-edge patterns, DGRMiner
the same as the original 5-tuple except for the new el-                               outputs frequent single-vertex patterns with high enough
ements. Specifically, ti , t j , t(i, j) are used for the rela-                       confidence. To compute confidence, it needs to decode an-
tive timestamps of vertex i, vertex j, and the edge be-                               tecedents of the patterns and then compute their support.
tween i and j. Element d(i, j) represents the orientation                             After that, the algorithm takes frequent initial edges and
of the edge between i and j, and it is one of the fol-                                sort them according to the extended version of the DFS
lowing: ←, →, −. The last one is used for undirected                                  lexicographic order of gSpan. An initial edge is such an
edges. Each pattern, i.e. graph rule in the condensed rep-                            edge that represents a change or at least one of its vertices
resentation, can be represented as a list of such 9-tuples.                           does.
Furthermore, it is easy to extend the gSpan’s DFS Code                                   Now for each initial edge we recursively call subproce-
for these 9-tuples and thus we can create ordering be-                                dure DGR_Subgraph_Mining, which searches for patterns
tween patterns and find a minimum DFS Code for each                                   growing from a given initial edge. This subprocedure, de-
pattern. For example, suppose that we obtained the fol-                               scribed in Algorithm 4, uses several arguments. s denotes
lowing order of vertex labels: A, +x, C=>D, A=>B, -x.                                 the current pattern, which is represented by its DFS Code.
Then the minimum code of the Predictive Rule 1 from                                   In D and A we keep union graphs in which current pattern
Fig. 2 is (0, 1, A, −1, →, +x, 0, A => B, 1), (0, 2, A, −1, →,                        and its antecedent can be found. Finally, when growing
+x, 0, A => B, 1), (0, 3, A, −1, ←, −x, 1,C => D, 1).                                 patterns from the i-th initial edge, we keep the first i ini-
                                                                                      tial edges in Estart . This last argument is used in function
                                                                                      min, which can be found in the first line of Algorithm 4.
4.3     Time Abstraction
                                                                                      The purpose of this function is to check whether the DFS
In order to be able to deal with a broader class of dynamic                           code of the given pattern is minimum, i.e. it was not found
graphs, we extended the mining algorithm to include two                               earlier when traversing the search space. Because all pat-
time abstraction methods. By time abstraction we mean                                 terns grow only from the initial edges S1 , it is enough to
usage of coarser timestamp values of union graphs in situ-                            check whether we cannot represent the current pattern by
ations where exact values are not required or suitable.                               a smaller DFS Code which starts by one of the edges in
56                                                                                                                 K. Vaculík


Algorithm 3 DGRMiner(DG)                                             Dataset           Dynamic graphs       Snapshots
 1: convert the input dynamic graph(s) DG into the union           ENRON                            1             895
    graph representation D;                                        RESOLUTION                     103           2911
 2: optional: apply a time abstraction method on union             SYNTH                            1             101
    graphs;                                                        SYNTH 20                        20           2020
 3: remove infrequent vertices and edges;
 4: output frequent change vertices with high enough con-                Table 1: Datasets used for experiments.
    fidence;
 5: S1 ← all frequent initial edges in D sorted in DFS lex-      Rank                                       Vertex label
    icographic order;                                            Employee                                   Emp
 6: for i ← 1 to |S1 | do                                        Vice President                             VP
 7:     p ← i-th edge from S1                                    Director                                   Dir
 8:     p.D ← graphs which contain p;                            President                                  Pres
 9:     p.A ← graphs which contain antecedent of p;              Manager                                    Man
10:     Estart ← first i edges from S1                           Trader                                     Trad
11:     DGR_Subgraph_Mining(p,p.D,p.A,Estart );                  CEO                                        CEO
                                                                 Managing Director Legal Department         MDLP
Algorithm 4 DGR_Subgraph_Mining(s,D,A,Estart )                   In House Lawyer                            Law
 1: if s 6= min(s, Estart ) then
                                                                 Managing Director                          MD
 2:       return;
                                                                Table 2: Ranks of employees in the Enron dataset and the
 3: enumerate s in each graph in D and count its children;      corresponding vertex labels.
 4: remove children of s which are infrequent;
 5: enumerate antecedent of s in graphs given by A;
 6: set s.A by graphs which contain antecedent of s;            implementation of DGRMiner on a PC equipped with
 7: con f ← confidence of s;                                    CPU Intel i5-4570, 3.2GHz, 16GB of main memory, and
 8: if con f ≥ con f min then                                   running 64-bit version of Windows 8.1. For all exper-
 9:       output s;                                             iments, we set con fmin = 0 and window size for union
10: sort remaining children in DFS lexicographic order;         graphs equal to 10.
11: for each child c do
12:       DGR_Subgraph_Mining(c,c.D,s.A,Estart );
                                                                5.1 Enron

                                                                The first dataset used in experiments is based on the email
Estart . If we can find such a smaller code, then the pattern   correspondence in the Enron company [3]. For our ex-
must have been discovered earlier a thus we backtrack.          periments we used preprocessed version of this email traf-
   If the code is minimum, we continue by enumerating           fic [10]. Specifically, we used the data containing informa-
the pattern in relevant graphs given by D and searching for     tion about time, sender, receiver, and LDC topic. From the
its children candidates. This step is similar to the one in     set of senders and receivers we created vertices of our dy-
gSpan. Also, all infrequent children are removed. Before        namic graph. These vertices do not change through time.
saving the current pattern, we need to compute its confi-       Each email message sent between a sender and a receiver
dence. As we described in Section 4.1, we are able to ex-       is represented by addition of a directed edge in the dy-
tract the antecedent from the current pattern and then count    namic graph. If the graph already contains the same edge,
its occurrences. Set A represents a set of candidate graphs,    we just update its time of addition. Snapshots of the dy-
in which we should search for the antecedent occurrences.       namic graph corresponds to days. As there were messages
The actual set of graphs containing the antecedent is then      with anomalous dates, we removed all messages sent be-
saved to s.A, where s.A ⊆ A, and it used as the A set for       fore 1998 and got 894 days of activity. With one extra
the pattern’s children.                                         day for vertex initialization we got 895 snapshots, as can
   Before recursive processing of the children of the cur-      be seen in Table 1. We used LDC topics as edge labels.
rent pattern, we need to sort the children according to the     There are 32 regular LDC topics expressing the topics of
extended version of the DFS lexicographic order. Set c.D        the messages plus two special topics used to label outlier
was created when the pattern s was enumerated and its           messages and messages with non-matching topic. We also
children were counted.                                          used rank of employees [10] to label vertices. Vertices
                                                                with unknown rank were removed from the graph and thus
5 Experiments                                                   only 130 vertices remained. Ranks and corresponding la-
                                                                bels are available in Table 2.
In this section we present results of experiments on three         Results on ENRON with σmin = 0.1 can be found in the
datasets. All the experiments were conducted by a C++           first row in Table 3. We decided to apply the time abstrac-
A Versatile Algorithm for Predictive Graph Rule Mining                                                                           57


                             Union       Union                         Time abstraction       1-vertex     All     Running
          Dataset                                    σmin   con fmin
                            vertices      edges                        vertices   all           rules     rules   time (sec)
      ENRON                   46290      182720      0.10         0       X        ×                  0    187          24.6
      ENRON DEL               47612      196653      0.10         0       X        ×                  0    233          29.3
      RESOLUTION              15966        5275      0.05         0       ×        ×                26      36           0.3
      RESOLUTION              15966        5275      0.05         0       ×        X                17     321           3.4
      SYNTH                    1124        2404      0.10         0       ×        X                  6     82           0.3
      SYNTH 20                31455       52112      0.10         0       ×        ×               121    1604          50.9

Table 3: Results of experiments. Number of union vertices and edges is taken over all union graphs of the given dataset.
1-vertex rules are rules whose union graph consists of only one vertex. Running time is averaged over five runs. For all
experiments we set window of size 10 when building union graphs.


tion method for vertices because they are only added in the
first snapshot and never changed. We found 187 frequent
rules, none of which was a single-vertex rule.
   We also modified the previous dataset by deleting edges
which were not updated immediately the next day. This
modified dataset is named ENRON DEL in Table 3. The
change allows us to capture patterns which could not be
captured only by edge additions. Examples of two rules
from this dataset are shown in Fig. 3.
                                                                           Figure 3: Examples of two rules from ENRON DEL.
5.2    Resolution Proofs in Propositional Logic
We used the set of graphs representing resolutions in                  with 10 vertices and 20 randomly assigned edges was cre-
propositional logic from [11] as the second dataset. Ver-              ated. Then we iteratively built 100 snapshots, each snap-
tices of these graphs contain lists of literals in proposi-            shot from the previous one by randomly chosen changes.
tional logic. All edges are directed and have the same la-             The number of changes ranged uniformly 0–1 for vertex
bel in these graphs. These dynamic graphs are evolving                 deletion, 0–1 for edge deletion, 0–1 for vertex addition, 0–
by vertex and edge addition or deletion, and by change of              3 for edge addition, 0–2 for vertex label change, and 0–2
vertex labels. Time of these events was transformed into a             for edge label change. All newly selected vertex (edge)
discrete sequence. Because there were 19 different assign-             labels were chosen from a uniform distribution over set
ments in total, the dynamic graphs had quite distinct vertex           {A, B} ({y, z}). Each new snapshot had to be different from
labels. In order to find frequent patterns, we restricted the          the previous one. In order to keep approximately the same
dataset to only one assignment. Specifically, we took the              number of vertices (edges) through time, additions or dele-
assignment with the greatest number of solutions. This set             tions of vertices (edges) were suspended if the number of
of graphs contained 103 dynamic graphs with 2911 snap-                 vertices (edges) was not in the [k/2, 2k] interval, where k
shots in total, see RESOLUTION dataset in Table 1. The                 = 10 (and k = 20 for edges). The second dataset, SYNTH
initial snapshot of each dynamic graph in an empty graph.              20, was created from 20 dynamic graphs, each one of them
For more details about this data refer to [11].                        built by the process just described.
   We conducted two experiments on this dataset, both                     For SYNTH and σmin = 0.1, the time abstraction of both
with σmin = 0.05 as there were not many frequent patterns.             vertices and edges was applied because there were almost
One with no time abstraction, and one with time abstrac-               no frequent patterns without the abstraction. On the other
tion of both vertices and edges. From Table 3 we can see               hand, experiments on SYNTH 20 with 20 dynamic graphs
that the time abstraction helped us to find ten times more             did not require any time abstraction and approximately
frequent rules for the same value of the minimum support.              1600 frequent rules were found for the same value of the
Furthermore, most of the rules were 1-vertex rules when                minimum support. This suggests that the support defini-
the abstraction was not applied. Such rules capture vertex             tion for a single dynamic graph is stricter than the one for
additions, deletions and relabelings without any context               a set of graphs.
and may not be very informative.

                                                                       6    Related Work
5.3    Synthetic Datasets
Besides real-world data, we also tested our method on a                Several algorithms have been proposed for graph rule min-
synthetic datasets. One of this dataset, SYNTH in Ta-                  ing. However, a lot of these algorithms are expecting
ble 1, was generated in the following way. First, a graph              that the only changes in a dynamic graph are caused by
58                                                                                                                      K. Vaculík


edge additions, or by vertex additions if the vertices be-      dynamic graphs are allowed to be directed or undirected
long to the edges being added. These algorithms include         multiedges. DGRMiner uses support and confidence as
GERM [1], LFR-Miner [7], and the algorithms presented           significance measures of the rules. Such graph rules are
in [5, 9]. These algorithms also pose various restrictions      useful for prediction in dynamic graphs, they can be used
on the form of the rule graphs.                                 as pattern features representing dynamic graphs or sim-
   The work most related to ours is probably the algorithm      ply for gaining an insight into internal processes of the
for mining interesting patterns and rules proposed in [8].      graphs. We evaluated the algorithm on two real-world and
This algorithm allows multiedges and also performs sim-         two synthetic datasets.
ilar time abstraction, however, it also supposes that labels       As future work, we plan to further investigate time ab-
do not change and rules express only edge addition. More-       straction methods and other ways of support computation.
over, antecedents of the rules have to be connected. This       We also intend to modify the method for mining frequent
is a stricter requirement than the one given by our algo-       closed patterns.
rithm, which requires only the union graphs of rules to be
connected.                                                      Acknowledgments. We would like to thank to Luboš
   Predictive graph rules can be also seen as subgraph se-      Popelínský and other members of KDLab FI MU for their
quences of length two. Mining of subgraph subsequences          help and comments.
from a set of graph sequences (dynamic graphs) is con-
sidered in [4], where algorithm FRISSMiner is proposed.
This algorithm also works with union graphs, however, it
                                                                References
builds the union graph from the whole sequence, i.e. the         [1] Berlingerio, M., Bonchi, F., Bringmann, B., Gionis, A.:
whole dynamic graph. FRISSMiner allows all types of                  Mining graph evolution rules. In: ECML PKDD’09.
changes in the input graph sequences but the subgraph se-            Springer-Verlag, Berlin, Heidelberg, 2009, 115–130
quences are not required to include the changes and thus         [2] Borgwardt, K. M., Kriegel, H. -P., Wackersreuther, P.: Pat-
user would still need to search for patterns representing            tern mining in frequent dynamic subgraphs. In: IEEE
changes in graphs. In addition, the algorithm is not de-             ICDM’06. Washington, DC, USA, 2006, 818–822
signed for a single-dynamic-graph scenario.                      [3] Cohen, W. W.: Enron email dataset. Web. Accessed 2 June
   Dynamic GREW [2] and the algorithm from [12] mine                 2015. urlhttps://www.cs.cmu.edu/~./enron/.
also patterns capturing information from several snap-           [4] Inokuchi, A., Washio, T.: Mining frequent graph sequence
shots. They assume that the input dynamic graph has a                patterns induced by vertices. In: SIAM SDM, 2010.
fixed set of vertices, and edges are inserted and deleted        [5] Kabutoya, Y., Nishida, K., Fujimura, K.: Dynamic net-
over time, which makes them quite restricted.                        work motifs: evolutionary patterns of substructures in com-
   Except for FRISSMiner, the algorithms presented in this           plex networks. In: Proceedings of the APWeb’11. Springer-
section are designed only for the single-dynamic-graph               Verlag, Berlin, Heidelberg, 2011, 321–326
scenario. On the contrary, FRISSMiner is designed only           [6] Leskovec, J., Kleinberg, J., Faloutsos, C.: Graphs over
for the set-of-dynamic-graphs scenario.                              time: densification laws, shrinking diameters and possi-
                                                                     ble explanations. In: ACM KDD’05, New York, NY, USA,
   Experimental comparison of the above algorithms with
                                                                     2005.
DGRMiner is difficult due to the fact that each algorithm
                                                                 [7] Leung, C. W. -K., Lim, E. -P., Lo, D., Weng, J.: Mining in-
mines different type of patterns. Also, different definitions
                                                                     teresting link formation rules in social networks. In: ACM
of support and confidence affect which patterns are fre-
                                                                     CIKM’10, New York, NY, USA, 2010, 209–218
quent. For example, GERM computes absolute support
                                                                 [8] Miyoshi, Y., Ozaki, T., Ohkawa, T.: Mining interesting pat-
and counts multiple occurrences of a pattern in a snapshot.
                                                                     terns and rules in a time-evolving graph. In: IMECS’11,
On the other hand, DGRMiner computes support of a pat-               Vol. I, March 16–18, 2011.
tern as a fraction of snapshots containing at least one oc-
                                                                 [9] Ozaki, T., Etoh, M.: Correlation and contrast link formation
currence of the pattern. Another difficulty arises from the          patterns in a time evolving graph. In: IEEE ICDMW’11,
fact that implementations of the algorithms, with the ex-            Washington, DC, USA, 2011, 1147–1154
ception of GERM, are not freely available for download.         [10] Priebe, C. E., Conroy, J. M. , Marchette, D. J., Park, Y.:
Lastly, the datasets used for experimental evaluation of the         Scan statistics on Enron graphs. Web. Accessed 8 June
above algorithms are often either not available or it is not         2015. <http://www.cis.jhu.edu/ parky/Enron>
possible to reproduce the same data.                            [11] Vaculík, K., Nezvalová, L., Popelínský, L.: Graph min-
                                                                     ing and outlier detection meet logic proof tutoring. In: G-
                                                                     EDM 2014, London, CEUR-WS.org, 2014, 43–50
7    Conclusion                                                 [12] Wackersreuther, B., Wackersreuther, P., Oswald, A.,
                                                                     Böhm, C., Borgwardt, K. M.: Frequent subgraph discov-
We proposed a new algorithm DGRMiner for mining fre-                 ery in dynamic networks. In: MLG’10, ACM, New York,
quent predictive graph rules from both single dynamic                NY, USA, 2010, 155–162
graphs and sets of dynamic graphs. This algorithm is able       [13] Yan, X., Han, J.: gSpan: Graph-based substructure pattern
to capture various changes in dynamic graphs. Edges in               mining. In: IEEE ICDM’02, Washington, DC, USA, 2002.

</pre>