<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Finding NeMo: Fishing in banking networks using network motifs</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Xavier Fontes</string-name>
          <email>xavier.fontes@feedzai.com</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Beatriz Malveiro</string-name>
          <email>beatriz.malveiro@feedzai.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>David Aparício</string-name>
          <email>david.aparicio@feedzai.com</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>João Tiago Ascensão</string-name>
          <email>joao.ascensao@feedzai.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Maria Inês Silva</string-name>
          <email>ines.silva@feedzai.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Pedro Bizarro</string-name>
          <email>pedro.bizarro@feedzai.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Feedzai</institution>
          ,
          <addr-line>Lisbon</addr-line>
          ,
          <country country="PT">Portugal</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Feedzai</institution>
          ,
          <addr-line>Porto</addr-line>
          ,
          <country country="PT">Portugal</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Banking fraud causes billion-dollar losses for banks worldwide. In fraud detection, graphs help to understand complex transaction patterns and to discover new fraud schemes. This work explores graph patterns in a real-world transaction dataset by extracting and analyzing its network motifs. Since banking graphs are heterogeneous, we focus on heterogeneous network motifs. Additionally, we propose a novel network randomization process that generates valid banking graphs. From our exploratory analysis, we conclude that network motifs extract insightful and interpretable patterns.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 INTRODUCTION</title>
      <p>Payment information theft renders online transactions susceptible
to fraud. Once detected, fraud entails a reimbursement from the
cardholder’s bank, leading to monetary costs to financial
institutions and customer friction.</p>
      <p>
        Fraud detection thus requires a deep understanding of the
underlying fraud patterns, and graphs offer an intuitive way to
visualize these patterns. One can further leverage graph mining in
transaction data to understand fraud schemes in a wide range of
applications. Hajdu and Krész [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] developed a methodology to
identify cycles in transaction networks as a means to detect fraudulent
expenses. Micale et al. [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] retrieved the most frequent patterns in a
relationship network of people involved in the Panama Papers to
identify the most relevant money laundering structures.
      </p>
      <p>
        In this work, we build networks from a real-world banking
dataset of card purchases and apply a widely used graph mining
tool – heterogeneous network motifs [
        <xref ref-type="bibr" rid="ref10 ref7">7, 10</xref>
        ]. Analyzing recurring
patterns in real banking networks sets a foundation for
understanding how fraud materializes in transaction data. From there, one can
extract insights about how legitimate and fraudulent transactions
"behave" and aid both fraud detection systems and fraud analysts.
      </p>
      <p>Copyright © 2021 for the individual papers by the papers’ authors. Copyright © 2021
for the volume as a collection by its editors. This volume and its papers are published
under the Creative Commons License Attribution 4.0 International (CC BY 4.0).
Published in the Proceedings of the 2nd Workshop on Search, Exploration, and
Analysis in Heterogeneous Datastores, co-located with VLDB 2021 (August 16-20, 2021,
Copenhagen, Denmark) on CEUR-WS.org.</p>
      <p>To the best of our knowledge, this work is the first to find and
explore heterogeneous network motifs in a real-world banking
setting. This work makes two main contributions:
• We propose a randomization process (i.e., a null model)
suited to banking datasets. The null model is a vital component of the
definition of network motifs because it provides the baseline
frequencies of each subgraph (detailed in Section 2.2.1).
• We extract network motifs from graphs built from card
transaction data and review them thoroughly. We include an
analysis of how the motif significance evolves as more random
networks are used (detailed in Section 3).</p>
      <p>The remainder of this paper is organized as follows. Section 2
presents our method and discusses the key components necessary
for our analysis. We describe the data and present results in
Section 3. Usage scenarios are proposed in Section 4. We put forward
our main takeaways and offer directions for future work in
Section 5.</p>
    </sec>
    <sec id="sec-2">
      <title>2 METHOD</title>
      <p>In this section we present our methodological choices. First, we
discuss how we build networks from banking datasets (Section 2.1).
Then, we introduce heterogeneous network motifs and describe
how to compute and identify these motifs in banking datasets,
with a special focus on the null model and the measure of motif
significance (Section 2.2).</p>
    </sec>
    <sec id="sec-3">
      <title>2.1 Graph representation</title>
      <p>Banking datasets usually consist of transactions between entities,
such as people, merchants, and businesses. They can include
different types of transactions, such as card payments or bank transfers.</p>
      <p>From a banking dataset, we can build two graph representations,
namely (a) entity graphs and (b) transaction graphs, as illustrated
in Figure 1.</p>
      <p>2.1.1 Entity graph. In an entity graph, G = (V, E), nodes V
represent entities, such as merchants or clients, and edges E connect
entities with at least one shared transaction. This way of
representing banking datasets is helpful to highlight suspicious entity
behavior.</p>
      <p>[Figure 1: Graph representations of a banking dataset: (a) entity graph, connecting client C1 and merchant M1; (b) transaction graph.]</p>
      <p>Consider, for example, the entity graph in Figure 1 (a). There,
we represent C1 and M1 as two connected nodes of different types,
i.e., {C1, M1} ⊆ V, (C1, M1) ∈ E, and L(C1) ≠ L(M1), where
L(v) is the type of node v. Since we connect all k entities in a
transaction, they form a k-node clique in the graph. When the same
two entities are parties to multiple transactions (e.g., a person makes
several purchases at a retail store), we aggregate the transaction
information into a single edge (u, v) ∈ E.</p>
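      <p>As a minimal sketch of this construction (not the authors' implementation), the Python snippet below builds an entity graph from a few hypothetical transaction records. The field names (client, card, merchant, terminal, is_fraud) and the helper build_entity_graph are assumptions made for illustration. Each transaction contributes a clique over its entities, and repeated entity pairs collapse into a single edge whose binary label records whether any shared transaction was fraudulent.</p>
      <preformat><![CDATA[
```python
from itertools import combinations

# Hypothetical transaction records; the field names are illustrative assumptions.
TRANSACTIONS = [
    {"client": "C1", "card": "K1", "merchant": "M1", "terminal": "T1", "is_fraud": False},
    {"client": "C1", "card": "K1", "merchant": "M1", "terminal": "T2", "is_fraud": True},
    {"client": "C2", "card": "K2", "merchant": "M1", "terminal": "T1", "is_fraud": False},
]
ENTITY_TYPES = ("client", "card", "merchant", "terminal")


def build_entity_graph(transactions, entity_types=ENTITY_TYPES):
    """Return (node_labels, edge_labels) of an undirected entity graph."""
    node_labels = {}  # entity id -> entity type
    edge_labels = {}  # frozenset({u, v}) -> True if any shared transaction is fraudulent
    for tx in transactions:
        entities = [tx[entity_type] for entity_type in entity_types]
        for entity_type, entity in zip(entity_types, entities):
            node_labels[entity] = entity_type
        # All entities of one transaction form a clique.
        for u, v in combinations(entities, 2):
            key = frozenset((u, v))
            # Aggregate repeated pairs into one edge with a binary fraud label.
            edge_labels[key] = edge_labels.get(key, False) or tx["is_fraud"]
    return node_labels, edge_labels


node_labels, edge_labels = build_entity_graph(TRANSACTIONS)
```
]]></preformat>
      <p>Storing each edge under an unordered pair keeps the graph undirected, and the logical OR implements the "at least one fraudulent transaction" rule for edge labels.</p>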
      <p>Therefore, an entity graph is undirected and heterogeneous. The
node label L(v) corresponds to the entity’s type (i.e., client, card,
merchant, or terminal). The edge label L(u, v) is binary, indicating
whether there is at least one fraudulent transaction involving the
two entities.</p>
      <p>2.1.2 Transaction graph. In a transaction graph, G = (V, E), nodes
V represent individual transactions, and edges E connect
transactions that share entities.</p>
      <p>In the transaction graph from Figure 1 (b), transactions T1 and
T2 are represented as two nodes connected by an edge indicating
that the same client made both, i.e., {T1, T2} ⊆ V and (T1, T2) ∈ E.
Since connecting all transactions with common entities would result
in very dense graphs, we only connect transactions that occurred
within a time window, e.g., transactions made by a client in less than
24 hours. Moreover, we use edge direction to encode the temporal
sequence of transactions, with edges connecting older transactions
to more recent ones, i.e., if (u, v) ∈ E, then v is more recent than u.
If two transactions occur at the same timestamp, we add a
bidirectional edge between the two nodes, i.e., {(u, v), (v, u)} ⊆ E.</p>
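      <p>The rules above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the authors' code: the tuple layout (id, client, merchant, timestamp in hours), the 24-hour window constant, and the helper build_transaction_graph are all hypothetical. Edges point from the older to the more recent transaction, and transactions with equal timestamps receive edges in both directions.</p>
      <preformat><![CDATA[
```python
from itertools import combinations

# Hypothetical transactions: (id, client, merchant, timestamp in hours).
TRANSACTIONS = [
    ("T1", "C1", "M1", 0.0),
    ("T2", "C1", "M2", 5.0),
    ("T3", "C2", "M2", 5.0),
    ("T4", "C1", "M1", 40.0),
]
WINDOW_HOURS = 24.0  # assumed time window


def build_transaction_graph(transactions, window=WINDOW_HOURS):
    """Return directed edges as {(older_id, newer_id): shared entity types}."""
    edges = {}
    for a, b in combinations(transactions, 2):
        shared = set()
        if a[1] == b[1]:
            shared.add("client")
        if a[2] == b[2]:
            shared.add("merchant")
        # Skip pairs with no common entity or outside the time window.
        if not shared or abs(a[3] - b[3]) >= window:
            continue
        label = frozenset(shared)
        if a[3] <= b[3]:
            edges[(a[0], b[0])] = label  # a is older than (or simultaneous with) b
        if b[3] <= a[3]:
            edges[(b[0], a[0])] = label  # equal timestamps yield both directions
    return edges


edges = build_transaction_graph(TRANSACTIONS)
```
]]></preformat>
      <p>In this toy input, T1 and T4 share both entities but fall outside the window, so they stay unconnected, while T2 and T3 share a merchant at the same timestamp and are linked in both directions.</p>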
      <p>Thus, a transaction graph is directed and heterogeneous. The
node label is binary (the transaction is fraudulent or legitimate),
i.e., L(v) ∈ {fraudulent, legitimate}. The edge label is one of 2<sup>k</sup> − 1 possible
labels, covering all the combinations of sharing one or more of k different
entity types, i.e., L(u, v) ∈ S with |S| = 2<sup>k</sup> − 1. In Figure 1, k = 2, hence
there are 3 possible labels for edges, |S| = 3.</p>
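      <p>The count of edge labels follows from enumerating every non-empty subset of the entity types that two transactions can share. The short Python sketch below makes this concrete; the function name edge_labels and the entity type names are illustrative assumptions.</p>
      <preformat><![CDATA[
```python
from itertools import combinations


def edge_labels(entity_types):
    """All non-empty subsets of entity types that two transactions can share."""
    labels = []
    for size in range(1, len(entity_types) + 1):
        labels.extend(combinations(entity_types, size))
    return labels


labels = edge_labels(["client", "merchant"])
# With two entity types the labels are: client only, merchant only, both.
```
]]></preformat>
      <p>With four entity types the same enumeration yields 15 labels, which illustrates how quickly the label space grows with the number of entity types.</p>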
      <p>
        Transaction graphs can be used to investigate patterns between
transactions. Additionally, machine learning classification models can
benefit from receiving topological information (including centrality
measures or node embeddings) extracted from transaction graphs
[
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
    </sec>
    <sec id="sec-4">
      <title>2.2 Heterogeneous network motifs</title>
      <p>
        A network motif is a subgraph that appears more frequently than
expected. The concept of appearing more frequently than expected
commonly relies on building a large set of randomized networks
R that are similar to the original network [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. In the literature,
authors typically use either 100 or 1000 randomized networks [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
      </p>
      <p>Suppose the frequency of a given subgraph S in the original
network G is, according to some significance measure (Section 2.2.2),
much higher than the (average) frequency on similar randomized
networks, i.e., freq(S, G) &gt;&gt; avg_freq(S, R). In that case,
subgraph S is a network motif. Similarly, if the subgraph’s
frequency is much lower than the (average) frequency on similar
randomized networks, that subgraph is an anti-motif. In this work, we
are interested in both motifs and anti-motifs.</p>
      <p>
        Motif discovery involves computing the frequency of a given set
of subgraphs and entails subgraph counting [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Subgraph counting
receives as input a graph G and a list of non-isomorphic subgraph
types (e.g., all possible unique subgraphs with four nodes). Then,
it outputs the frequencies of each subgraph type in G, e.g., the
frequencies of 4-node cliques, 4-node chains, or 4-node stars.
      </p>
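      <p>To illustrate what subgraph counting computes, the sketch below counts the two connected 3-node subgraph types (chains and cliques) in a small undirected, unlabeled graph by brute-force enumeration. It is a naive illustration and not the g-trie algorithm; the edge list and the function name count_3node_subgraphs are assumptions made for this example.</p>
      <preformat><![CDATA[
```python
from itertools import combinations

# Hypothetical undirected graph given as a set of edges.
EDGES = {("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")}


def count_3node_subgraphs(edges):
    """Count induced 3-node chains and cliques by checking every node triple."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    counts = {"chain": 0, "clique": 0}
    for trio in combinations(sorted(adjacency), 3):
        n_edges = sum(1 for u, v in combinations(trio, 2) if v in adjacency[u])
        if n_edges == 3:
            counts["clique"] += 1
        elif n_edges == 2:
            counts["chain"] += 1  # two edges on three nodes always form a path
    return counts


counts = count_3node_subgraphs(EDGES)
```
]]></preformat>
      <p>Checking every node triple costs cubic time in the number of nodes, which is why dedicated data structures are needed for graphs with hundreds of thousands of edges.</p>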
      <p>
        In this work, we use g-tries for subgraph counting since they are
a general framework able to count subgraphs of arbitrary size in
heterogeneous graphs [
        <xref ref-type="bibr" rid="ref7 ref8">7, 8</xref>
        ]. Other approaches are faster for counting
specific subgraphs, but, as far as we know, g-tries are the only
method that supports directed and heterogeneous graphs [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
      </p>
      <p>[Figure 2: 3-node cliques with (a) homogeneous nodes and edges, (b) heterogeneous nodes, (c) heterogeneous edges, and (d) heterogeneous nodes and edges.]</p>
      <p>Since our graphs are heterogeneous, we need network motifs
that consider node and edge heterogeneity. Concretely, we need to
extract heterogeneous motifs and anti-motifs.</p>
      <p>As an illustrative example, consider a 3-node clique where nodes
and edges are homogeneous (Figure 2 (a)) and the following
heterogeneous graph settings: (i) if nodes can be of two different types,
there are four possible 3-node cliques (Figure 2 (b)), (ii) if edges
can be of two different types, there are four possible 3-node cliques
(Figure 2 (c)). Thus, in our work, disregarding node or edge labels
undermines the necessary differentiation of topological structures.
Heterogeneous motif discovery is more informative than traditional
homogeneous motif discovery but more complex to extract and
analyze.</p>
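      <p>The counts in this example can be checked by brute force. The sketch below enumerates every coloring of a 3-node clique and keeps one canonical representative per class of colorings that are identical up to relabeling of the nodes. The function name count_cliques is an assumption, and the value it reports for cliques with both heterogeneous nodes and edges is computed by this sketch rather than stated in the text.</p>
      <preformat><![CDATA[
```python
from itertools import permutations, product

NODES = (0, 1, 2)
EDGES = ((0, 1), (0, 2), (1, 2))


def count_cliques(n_node_types, n_edge_types):
    """Count 3-node cliques that are distinct up to relabeling of the nodes."""
    canonical = set()
    for node_colors in product(range(n_node_types), repeat=3):
        for edge_colors in product(range(n_edge_types), repeat=3):
            edge_color = dict(zip(EDGES, edge_colors))
            variants = []
            for perm in permutations(NODES):
                # Relabel the nodes with perm and carry node and edge colors along.
                new_nodes = tuple(node_colors[perm[i]] for i in NODES)
                new_edges = tuple(
                    edge_color[tuple(sorted((perm[u], perm[v])))] for u, v in EDGES
                )
                variants.append((new_nodes, new_edges))
            # The smallest variant identifies the whole equivalence class.
            canonical.add(min(variants))
    return len(canonical)
```
]]></preformat>
      <p>Under these assumptions, count_cliques(2, 1) and count_cliques(1, 2) both return 4, matching settings (i) and (ii), and count_cliques(2, 2) returns 20 for cliques with two node types and two edge types.</p>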
      <p>
        In banking fraud analysis, heterogeneous network motifs are
more helpful than homogeneous motifs. For instance, knowing that
"clients connected to many different cards are more likely to be
fraudulent" is arguably more informative than just knowing that
"dense subgraphs can be indicative of fraud".
      </p>
      <p>
        2.2.1 Network randomization. Computing the expected frequency
of a given subgraph requires randomizing the original network
so that randomized networks are similar to the original. However,
defining network similarity is non-trivial and task-dependent. In
practice, the following two approaches are common:
• Initialize the randomized graph as a copy of the original
graph and iteratively swap random pairs of edges [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. This
method preserves vertices’ in- and out-degrees.
• Initialize the randomized graph with the same nodes as the
original graph and incrementally add edges with
probabilities based on each node’s degree in the original graph [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
This method provides an approximation of the vertices’
in- and out-degrees.
      </p>
      <p>However, these strategies are unsuitable for the banking fraud
domain as they fundamentally change the semantic structure of
banking graphs. For instance, in an entity graph, entities of the
same type are never directly connected by an edge. Adding
or removing edges indiscriminately (while preserving each node’s
degree) will eventually connect nodes of the same type
and thus compromise the validity of the randomized network. Since
these strategies do not take node and edge labels into account, we
need to follow a different approach for network randomization.</p>
      <p>
        Instead of randomizing the original graph directly, we apply
a randomization procedure to the tabular data and then
build the randomized networks from the randomized tabular data.
The randomization procedure works as follows:
(1) Shuffle all the fraud labels. This step maintains the fraud
rate of the original dataset but randomizes its attribution to
different transactions.
(2) Iterate over each entity type and, for each entity type,
randomly choose p × n pairs of values to be swapped. Here,
p ∈ [0, 1] controls how much we want to randomize the
original network, and n is the number of transactions.
(3) Build the resulting graph from the randomized data, as
described in Sections 2.1.1 and 2.1.2.
      </p>
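      <p>The first two steps of this procedure can be sketched on tabular data as follows. The snippet is a simplified illustration, not the authors' implementation: the row layout, the column names, the parameter name swap_fraction (the fraction of transactions that determines the number of swaps per entity type), and the function randomize_transactions are assumptions. The third step would rebuild the graph from the returned rows.</p>
      <preformat><![CDATA[
```python
import random

# Hypothetical tabular data: one row per transaction.
ROWS = [
    {"client": f"C{i}", "merchant": f"M{i % 2}", "is_fraud": i == 0}
    for i in range(6)
]


def randomize_transactions(rows, entity_columns, swap_fraction, seed=None):
    """Shuffle fraud labels, then swap entity values between random row pairs."""
    rng = random.Random(seed)
    randomized = [dict(row) for row in rows]  # leave the input untouched

    # Step 1: shuffle fraud labels; the overall fraud rate is preserved.
    labels = [row["is_fraud"] for row in randomized]
    rng.shuffle(labels)
    for row, label in zip(randomized, labels):
        row["is_fraud"] = label

    # Step 2: for each entity type, swap values between randomly chosen row pairs.
    n_swaps = int(swap_fraction * len(randomized))
    for column in entity_columns:
        for _ in range(n_swaps):
            i, j = rng.sample(range(len(randomized)), 2)
            randomized[i][column], randomized[j][column] = (
                randomized[j][column],
                randomized[i][column],
            )
    return randomized


randomized = randomize_transactions(ROWS, ["client", "merchant"], 0.5, seed=7)
```
]]></preformat>
      <p>Because values are only swapped within a column, every row still holds exactly one value per entity type, so a graph rebuilt from the randomized rows never links two entities of the same type.</p>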
      <p>
        This network randomization strategy guarantees semantically
valid banking networks with a random topology.
      </p>
      <p>
        2.2.2 Motif significance measures. After counting subgraphs
in both the original network and the randomized networks, we use
a motif significance measure to evaluate which subgraphs are
over- and under-represented. Several measures have been proposed [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ],
but here we focus on two of the most well-established: the z-score
and the ratio.
      </p>
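      <p>The z-score of a subgraph S, z<sub>S</sub>, is computed as follows:</p>
      <disp-formula>
        <label>(1)</label>
        <tex-math><![CDATA[z_S = \frac{f_{\mathrm{orig}} - \mu_{\mathrm{rand}}}{\sigma_{\mathrm{rand}}}]]></tex-math>
      </disp-formula>
      <p>where:
• f<sub>orig</sub> is the frequency of subgraph S in the original network.
• μ<sub>rand</sub> and σ<sub>rand</sub> are the mean and standard deviation of the
frequencies of subgraph S in the randomized networks,
respectively.</p>
      <p>We compute the ratio of a subgraph S, r<sub>S</sub>, as follows:</p>
      <disp-formula>
        <label>(2)</label>
        <tex-math><![CDATA[r_S = \frac{f_{\mathrm{orig}}}{\mu_{\mathrm{rand}}}]]></tex-math>
      </disp-formula>
      <p>Our goal is to find the subgraphs with the highest z-scores/ratio
(i.e., the motifs) and subgraphs with the lowest z-scores/ratio (i.e.,
the anti-motifs).</p>
      <p>In our experiments (Section 3), we show the ratio of the
subgraphs since it is more interpretable than the z-score: if r<sub>S</sub> = 100,
the subgraph appears 100 times more often in the original network
than in the randomized networks. We complement our analysis by
reporting f<sub>orig</sub>, μ<sub>rand</sub>, and σ<sub>rand</sub>.</p>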
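      <p>As a worked illustration of the two significance measures described above, the sketch below computes the z-score and the ratio of one subgraph from its count in the original network and its counts in the randomized networks. The counts are invented for illustration, the function name motif_significance is hypothetical, and the use of the population standard deviation is an assumption, since the text does not say which estimator is used.</p>
      <preformat><![CDATA[
```python
from statistics import mean, pstdev


def motif_significance(original_count, random_counts):
    """Return (z_score, ratio) of one subgraph; None marks an undefined value."""
    mu = mean(random_counts)
    sigma = pstdev(random_counts)  # population standard deviation (assumption)
    z_score = (original_count - mu) / sigma if sigma > 0 else None
    ratio = original_count / mu if mu > 0 else None
    return z_score, ratio


# Invented counts: 200 occurrences in the original network and
# between 1 and 3 occurrences in five randomized networks.
z_score, ratio = motif_significance(200, [1, 2, 3, 2, 2])
```
]]></preformat>
      <p>Here the mean count in the randomized networks is 2, so the ratio is 100. The z-score is far larger because the random counts barely vary, which shows why the ratio is the easier quantity to interpret.</p>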
    </sec>
    <sec id="sec-5">
      <title>3 RESULTS</title>
    </sec>
    <sec id="sec-6">
      <title>3.1 Data overview</title>
      <p>Table 1 outlines the parameters used to generate the entity graph
and the transaction graph and provides summary statistics on the
two graphs.</p>
      <p>In the entity graph, we consider four node types (i.e., client,
merchant, terminal, and card) and two edge types (i.e., fraudulent
or legitimate).</p>
      <p>In the transaction graph, we consider two node types (i.e.,
fraudulent and legitimate) and three edge types (i.e., only client, only
merchant, or both client and merchant). The rationale for choosing
these two entities lies in the close relationship between clients and
cards and, similarly, merchants and terminals. In other words, since
most clients use few cards and most merchants have few terminals,
we simplify the graph by dropping the card and terminal entities.</p>
      <p>Additionally, we found it necessary to constrain the considered
time window due to computational constraints on subgraph
counting.</p>
      <p>Both graphs have hundreds of thousands of edges and many
different connected components. The fraud rate in the entity graph
is the fraction of fraudulent edges, while, in the transaction graph,
the fraud rate is the fraction of fraudulent nodes. As a result, the
fraud rates differ depending on the graph type.</p>
    </sec>
    <sec id="sec-7">
      <title>3.2 Motif analysis</title>
      <p>In this subsection, we start with the entity graph results and then
show the results of the transaction graph.</p>
      <p>We analyze which subgraphs are motifs and anti-motifs based
on the ratio (Equation 2): we consider subgraphs that have a
considerably higher ratio than the others to be motifs, and subgraphs
with the lowest ratios to be anti-motifs.</p>
      <p>We focus on subgraphs of size three due to the higher
computational cost of larger subgraphs.</p>
      <p>3.2.1 Entity graph. From Figure 3 we can see that the ratio of
most subgraphs stabilizes relatively quickly, at ≈ 40
random networks. We observe that a few subgraphs have a significantly
higher ratio than the others (ratio &gt; 100). Moreover, a few subgraphs
have a significantly lower ratio than the others (ratio ≈ 0.01). We
consider the former to be the motifs, and the latter to be the anti-motifs,
shown in Figure 4 (a) and Figure 4 (b), respectively.</p>
      <p>[Figure 3: Ratio of each subgraph in the entity graph versus the number of random networks.]</p>
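      <p>The stabilization analysis can be reproduced by recomputing the ratio each time one more randomized network is added. The sketch below does this with a running mean; the counts are invented and the function name running_ratio is an assumption made for illustration.</p>
      <preformat><![CDATA[
```python
from itertools import accumulate


def running_ratio(original_count, random_counts):
    """Ratio after using the first 1, 2, ..., n randomized networks."""
    ratios = []
    for n, total in enumerate(accumulate(random_counts), start=1):
        running_mean = total / n
        # A subgraph never seen in the random networks has an unbounded ratio.
        ratios.append(
            original_count / running_mean if running_mean > 0 else float("inf")
        )
    return ratios


# Invented counts: 100 occurrences in the original network and the
# counts observed in four randomized networks.
ratios = running_ratio(100, [2, 0, 1, 1])
```
]]></preformat>
      <p>Plotting such a sequence for every subgraph gives a curve per subgraph, and the point where the curves flatten suggests how many randomized networks are enough.</p>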
      <p>From Figure 4 we notice that none of the subgraphs is a 3-node
clique. This result might seem counter-intuitive as all transactions
form a 4-node clique. A possible explanation is that some merchants
are hub nodes and thus induce chain-like subgraphs. Another
general observation is that all edges in the motifs are fraudulent, which
is not true for the anti-motifs. Domain knowledge indicates that
fraudulent transactions tend to be closely connected.</p>
      <p>The first three motifs from Figure 4 (a) have a client at their center,
connected to either (i) two terminals, (ii) one terminal and a
merchant, or (iii) two merchants. Subgraphs (i) and (iii) tell us that the
client made two fraudulent transactions at different terminals and
merchants, respectively. These two subgraphs are motifs because
fraud is sometimes recurring for fraudulent clients, and this
information is lost in the randomized networks since they reshuffle fraud
labels. Subgraph (ii) tells a similar story, but notice that, since the
merchant and the terminal are not connected, they belong to
different transactions. The last three motifs are
equivalent to the first three but with the card at the center of the
subgraph. Since clients can use multiple cards, the original network
counts are lower for subgraphs with the card at the center of the
chain than for subgraphs with the client at the center
of the chain. We also note that, in practice, it might be interesting
to investigate all cards used by the client.</p>
      <p>The first three anti-motifs from Figure 4 (b) have a merchant
at their center. All three anti-motifs convey the same information:
merchants typically either have only legitimate transactions or only
fraudulent transactions, i.e., fraud tends to cluster around the same risky
merchants. In the randomized networks, since we randomly swap
edges, these relations are lost. Finally, the last anti-motif tells us
that it is not common for clients to use multiple cards in legitimate
transactions. Indeed, clients that use multiple cards are typically
associated with fraudulent activity.</p>
      <p>3.2.2 Transaction graph. Figure 5 shows the evolution of the ratio
for each 3-node subgraph in the transaction graph. After 80 random
networks, the ratios seem to stabilize, and we can clearly distinguish
five subgraphs that stand out, namely, the top-4 subgraphs and the
bottom-1 subgraph. These subgraphs are presented in Figure 6.</p>
      <p>The common feature in the four motifs is at least two transactions
that share the same client and merchant. We may lose this pattern
during the randomization since we swap merchants and clients
independently. However, it is still interesting to observe a significant
number of patterns where the same client makes two or three
transactions at the same merchant in less than six hours.</p>
      <p>[Figure 5: Ratio of each 3-node subgraph in the transaction graph (logarithmic axis, 0.1 to 10,000) versus the number of random networks.]</p>
      <p>Three of the four top motifs are 3-node cliques, which happen
when three transactions that share a merchant or client occur within
the same six-hour window. Once again, the randomization of the
entities breaks such patterns.</p>
      <p>It is important to note that some motifs contain transactions
processed at the same timestamp (represented with bidirectional
edges). This pattern can happen at the same merchant when the
merchant is processing transactions in a batch or has multiple
terminals.</p>
      <p>The only anti-motif is a sequence of three transactions at the
same merchant, where the middle transaction is fraudulent, and the
other two are legitimate. This pattern is less frequent in the original
network than in the randomized networks since fraudulent transactions
typically occur together in reality, and the randomized networks
break this pattern.</p>
    </sec>
    <sec id="sec-8">
      <title>4 USAGE SCENARIOS</title>
      <p>The motif analysis can be a tool to characterize banking datasets.
Beyond summary statistics, the list of motifs and anti-motifs surfaces
underlying fraud patterns.</p>
      <p>Fraud experts, who may not be knowledgeable in data science or
statistics, often use graph visualization for data exploration. Motif
analysis serves as a visual summary characterization of a dataset
for fraud detection.</p>
      <p>As an illustrative example, let us consider a fraud expert at a
bank. Their role entails designing new rules to prevent fraud and
reviewing unlabelled transactions. The expert can review the
characteristic patterns surfaced by motif analysis to tailor the fraud
detection system in place. Over time, fraudsters design new schemes
to evade detection. Periodic analysis of motifs surfaces upcoming
fraud schemes and overall behavioral trends.</p>
      <p>[Figure 6: Motifs and anti-motif of the transaction graph. Surviving annotations: 1759 / 0.14 ± 0.99; 2507 / 0.30 ± 0.54; 289 / 0.13 ± 0.34; 2,223; 1,567.]</p>
      <p>When reviewing an unlabelled transaction, the expert can
compare the respective subgraph with known motifs. This context can
give insight into which pattern, fraudulent or legitimate, might
occur.</p>
      <p>On the other hand, by extracting the most relevant transaction
patterns and uncovering new fraud schemes, motif analysis can
complement common fraud detection systems. These insights can
be used to improve both machine learning models and rule-based
systems.</p>
    </sec>
    <sec id="sec-9">
      <title>5 CONCLUSIONS</title>
      <p>We explore motifs and anti-motifs in the context of banking fraud
using two graph representations, namely entity graphs and
transaction graphs. We propose a novel randomization method that
operates directly on tabular data. This way, we overcome the
limitations of current network randomization methods in the context
of banking fraud.</p>
      <p>Moreover, we extract heterogeneous network motifs that convey
more information than traditional network motifs and find they
offer interpretable results. Insights extracted from motif analysis can
help fraud analysts investigate specific cases and improve
fraud detection systems by uncovering new fraud schemes.</p>
      <p>
        As future work, one can investigate whether different banking
datasets have similar motifs (and anti-motifs) and if those patterns
are different in merchant datasets. This research would follow the
findings by Milo et al. [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], where they report that networks with
similar contexts have similar subgraph patterns. One can also
extend the analysis to larger motif sizes, different temporal windows,
and the inclusion of transaction amounts or fraud labels as graph
properties.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>László</given-names>
            <surname>Hajdu</surname>
          </string-name>
          and
          <string-name>
            <given-names>Miklós</given-names>
            <surname>Krész</surname>
          </string-name>
          .
          <year>2020</year>
          .
          <article-title>Temporal network analytics for fraud detection in the banking sector</article-title>
          .
          <source>In ADBIS, TPDL and EDA 2020 Common Workshops</source>
          . Springer,
          <fpage>145</fpage>
          -
          <lpage>157</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Giovanni</given-names>
            <surname>Micale</surname>
          </string-name>
          , Alfredo Pulvirenti, Alfredo Ferro, Rosalba Giugno, and
          <string-name>
            <given-names>Dennis</given-names>
            <surname>Shasha</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>Fast methods for finding significant motifs on labelled multirelational networks</article-title>
          .
          <source>Journal of Complex Networks</source>
          <volume>7</volume>
          ,
          <issue>6</issue>
          (
          <year>2019</year>
          ),
          <fpage>817</fpage>
          -
          <lpage>837</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Ron</given-names>
            <surname>Milo</surname>
          </string-name>
          , Shalev Itzkovitz, Nadav Kashtan, Reuven Levitt, Shai Shen-Orr, Inbal Ayzenshtat, Michal Sheffer, and
          <string-name>
            <given-names>Uri</given-names>
            <surname>Alon</surname>
          </string-name>
          .
          <year>2004</year>
          .
          <article-title>Superfamilies of evolved and designed networks</article-title>
          .
          <source>Science</source>
          <volume>303</volume>
          ,
          <issue>5663</issue>
          (
          <year>2004</year>
          ),
          <fpage>1538</fpage>
          -
          <lpage>1542</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Ron</given-names>
            <surname>Milo</surname>
          </string-name>
          , Shai Shen-Orr,
          <string-name>
            <given-names>Shalev</given-names>
            <surname>Itzkovitz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N</given-names>
            <surname>Kashtan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Dmitri</given-names>
            <surname>Chklovskii</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Uri</given-names>
            <surname>Alon</surname>
          </string-name>
          .
          <year>2002</year>
          .
          <article-title>Network Motifs: Simple Building Blocks of Complex Networks</article-title>
          .
          <source>Science</source>
          (New York, N.Y.)
          <volume>298</volume>
          (11
          <year>2002</year>
          ),
          <fpage>824</fpage>
          -
          <lpage>7</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Catarina</given-names>
            <surname>Oliveira</surname>
          </string-name>
          , João Torres, Maria Inês Silva, David Aparício, João Tiago Ascensão, and
          <string-name>
            <given-names>Pedro</given-names>
            <surname>Bizarro</surname>
          </string-name>
          .
          <year>2021</year>
          .
          <article-title>GuiltyWalker: Distance to illicit nodes in the Bitcoin network</article-title>
          .
          <source>arXiv:2102.05373</source>
          [cs.LG]
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Pedro</given-names>
            <surname>Ribeiro</surname>
          </string-name>
          , Pedro Paredes, Miguel EP Silva, David Aparicio, and
          <string-name>
            <given-names>Fernando</given-names>
            <surname>Silva</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>A survey on subgraph counting: concepts, algorithms and applications to network motifs and graphlets</article-title>
          . arXiv preprint arXiv:1910.13011 (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Pedro</given-names>
            <surname>Ribeiro</surname>
          </string-name>
          and
          <string-name>
            <given-names>Fernando</given-names>
            <surname>Silva</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Discovering colored network motifs</article-title>
          . In Complex Networks V. Springer,
          <fpage>107</fpage>
          -
          <lpage>118</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Pedro</given-names>
            <surname>Ribeiro</surname>
          </string-name>
          and
          <string-name>
            <given-names>Fernando</given-names>
            <surname>Silva</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>G-Tries: a data structure for storing and finding subgraphs</article-title>
          .
          <source>Data Mining and Knowledge Discovery</source>
          <volume>28</volume>
          ,
          <issue>2</issue>
          (
          <year>2014</year>
          ),
          <fpage>337</fpage>
          -
          <lpage>377</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Pedro</given-names>
            <surname>Ribeiro</surname>
          </string-name>
          , Fernando Silva, and
          <string-name>
            <given-names>Marcus</given-names>
            <surname>Kaiser</surname>
          </string-name>
          .
          <year>2009</year>
          .
          <article-title>Strategies for network motifs discovery</article-title>
          .
          <source>In 2009 Fifth IEEE International Conference on e-Science. IEEE</source>
          ,
          <fpage>80</fpage>
          -
          <lpage>87</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Ryan A</given-names>
            <surname>Rossi</surname>
          </string-name>
          , Nesreen K Ahmed, Aldo Carranza, David Arbour, Anup Rao, Sungchul Kim, and
          <string-name>
            <given-names>Eunyee</given-names>
            <surname>Koh</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>Heterogeneous network motifs</article-title>
          . arXiv preprint arXiv:1901.10026 (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>F.</given-names>
            <surname>Xia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Wei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Zhang</surname>
          </string-name>
          , and
          <string-name>
            <given-names>B.</given-names>
            <surname>Xu</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>A Survey of Measures for Network Motifs</article-title>
          .
          <source>IEEE Access</source>
          <volume>7</volume>
          (
          <year>2019</year>
          ),
          <fpage>106576</fpage>
          -
          <lpage>106587</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>