<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Monotone Sampling of Networks?</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Tim Grube</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Benjamin Schiller</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Thorsten Strufe</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Technische Universita ̈t Darmstadt</institution>
          ,
          <addr-line>Hochschulstraße 10, 64289 Darmstadt</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Technische Universita ̈t Dresden</institution>
          ,
          <addr-line>No ̈thnitzer Straße 46, 01187 Dresden</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Determining the graph-theoretic properties of large real-world networks like social, computer, and biological networks, is a challenging task. Many of those networks are too large to be processed eciently and some are not even available in their entirety. In order to reduce the size of available data or collect a sample of an existing network, several sampling algorithms were developed. They aim to produce samples whose properties are close to the original network. It is unclear what sample size is sucient to obtain a sample whose properties can be used to estimate those of the original network. This estimation requires sampling algorithms that produce results that converge smoothly to the original properties since estimations based on unsteady data are unreliable. Consequently, we evaluate the monotonicity of sampled properties while increasing the sample size. We provide a ranking of common sampling algorithms based on their monotonicity of relevant network properties using the results from four nework classes.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Today’s networks are quite large, in many cases too large to understand the
network or to compute its properties. We have to reduce the complexity and therefore
the size of networks to use the network for analyses and research. We can reduce
the size by using graph coarsening or sampling techniques. Graph coarsening and
some sampling techniques require the availability of the complete network. This
constraint is rarely satisfied. Sampling by exploration allows to gain knowledge
about the unavailable network, but it usually distorts properties as the sampling
process can be biased. There are two large classes of sampling algorithms, we can
sample using a breadth first sampling (BFS) approach, constructing the sample
from the local area first, or we can use a random walk (RW) approach, traversing
along random paths of nodes and constructing the sample with nodes from deeper
areas of the network. The convergence behavior of network properties like the
degree distribution depends highly on the underlying network and the used sampling
algorithm.</p>
      <p>Many work has been done to overcome these biases and many specialized
algorithms were developed. These algorithms produce samples, whose properties
converge faster to the original networks properties, but as the properties of the original
network are typically unknown, it is undecidable whether the quality demands or
original properties are met.</p>
      <p>Our approach is another way to solve this problem. We are proposing a new
metric, which allows to develop an estimator for network properties in future work.
This estimator should deliver the properties of the original network if it gets the
network properties of the sample, the specification of the sampling algorithm, and
the assumed size of the original network. For the development of such an estimator,
we need the sampling algorithm to produce a sample with monotone converging
network properties. In this paper, we are investigating the convergence
monotonicity of the network properties on sampled networks.</p>
      <p>The rest of the paper is structured as follows: Section 2 presents the related work
about sampling and the evaluation of newly developed sampling algorithms. We
present in Section 3 the desired behavior of sampling algorithms. We present the
results and the discussion of our work in Section 4 and conclude with a summary
and outlook in Section 5.
2</p>
    </sec>
    <sec id="sec-2">
      <title>Related Work</title>
      <p>The related work lists two typical classes of sampling techniques. The first one is
the deletion of nodes or edges. Node deletion techniques use the complete network
as basis and are deleting nodes until the network size is reduced to the desired
size. Edge deletion uses a similar technique, instead of removing nodes, these
algorithms remove edges and perhaps the attached nodes. The complete network has
to be available to apply these two techniques. The second technique is sampling
by exploration, using this technique, the sampling algorithm traverses from a start
node into the network and collect the nodes to the sample. This technique is
interesting for at least two reasons: First, it is easy to use by instrumenting crawlers.
Second, we do not have this dependency on the availability of the complete
network. Sampling by exploring is the common technique to reduce the complexity
of networks and gain an excerpt of the whole network.</p>
      <p>A lot of research has been done into the direction of developing more
sophisticated sampling algorithms in this area. These algorithms are developed to
produce samples, whose property values converge faster towards the property values
of the original network. The sampling algorithms can be classified into breadth
first approaches and random walk ones. Table 1 shows the classification, based on
the type of walking, and the abbreviations of the analyzed algorithms.</p>
      <sec id="sec-2-1">
        <title>Breadth First Sampling</title>
        <p>
          The BFS algorithms traverse through the network by focusing on the local
neighborhood first. The simplest implementation is a classical BFS which visits all the
neighbors of node. Krishnamurthy, Leskovec and Stutzbach [
          <xref ref-type="bibr" rid="ref13 ref4 ref7">4,7,13</xref>
          ] use the BFS
in their work. Goodman et al. [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ] introduced snowball sampling (SS), which is
a variation of the BFS and visits only a specifiable number of new, still unseen
neighbors. Lee et al. [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] have evaluated this variation. Similar to SS,
respondentdriven sampling (RDS) is developed by Heckarthorn et al. [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] and analyzed by
Rasti and Kurant [
          <xref ref-type="bibr" rid="ref10 ref5">10,5</xref>
          ]. RDS visits a specifiable number of random neighbors,
ignoring whether these neighbors are already known. Forest fire (FF), introduced
by Leskovec et al. [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ], is the last BFS derivate. It collects all neighbors with a
certain probability into a walking queue.
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>Random Walk Sampling</title>
        <p>
          The second class is the one of the random walk algorithms. The simplest one is
the random walk sampling (RW). This algorithm traverses through the network
by exploring along random neighbors. This sampling algorithm is well studied,
e.g. by Stutzbach, Leskovec and Ribeiro [
          <xref ref-type="bibr" rid="ref11 ref13 ref7">13,7,11</xref>
          ]. Intuitively, and mathematically
proveable, the RW sampling is biased towards nodes with higher degrees in the
network. To overcome this bias, and to collect a representative sample of the
network, two methods of correcting the probability for the next node were
introduced. Stutzbach et al. [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ] called their version random walk with degree correction
(RW-DC). Rasti et al. [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] proposed a slightly di↵erent correction and named it
metropolized hastings random walk (RW-MH). Both approaches depend on the
degree of the potential next node on their exploration path.
        </p>
        <p>
          Stutzbach et al. [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ] showed two further variations of the RW: The first one is the
random stroll (RS). This variation is moving like the RW but skips intermediate
nodes from adding them to the sample. The second algorithm is the combination
of RS and RW-DC. Random stroll with degree correction (RS-DC) skips
intermediate nodes like the RS and moves with a degree correction like the RW-DC.
Leskovec et al. [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ] introduced another variation of the RW. In particular, they
introduced a jump probability in the random jump (RJ). This probability allows
to move to a farther area of the network to avoid getting caught in a small area
of the network.
        </p>
        <p>
          Ribeiro et al. [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ] developed a variation of a random walk with multiple instances.
Since a simple parallel execution would su↵er from the same problems like the
classical RW, they introduced a dependency between the instances. Only one
instance, in the default setting the one with the highest node degree on the current
position, is allowed to move through the network. The active instance is picked
every round again. This sampling algorithm is called frontier sampling (FS).
Another algorithm, similar to the RW, is the depth first sampling (DFS). This
algorithm moves to the first neighbor and collects the remaining neighbors in a
queue. This queue a↵ects the behavior if the algorithm is getting caught. Even
though there are similarities between RW and DFS, the results are quite di↵erent
due to the impact of the queue in DFS. This algorithm is often used as a kind of
baseline, e.g. by Krishnamurthy and Leskovec [
          <xref ref-type="bibr" rid="ref4 ref7">4,7</xref>
          ].
We analyze the common network properties. The earlier introduced sampling
algorithms are partially evaluated with these metrics in their corresponding papers.
There are two types of properties: The first one is a single scalar value, for
example a floating-point number or an integer. The second type is a distribution,
which provides for each possible value a certain probability to find this value in
the network. We concentrate our work in this paper on the single scalar values.
        </p>
        <p>
          The assortativity coecient (AC) is a measure for the correlation of connected
nodes. We use the definition from Newman et al. [
          <xref ref-type="bibr" rid="ref8 ref9">9,8</xref>
          ]. We analyze the
clustering coecient (CC) in both characteristics, the transitivity which is for example
used by Chakrabarti et al. [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] and the average clustering coecient which is for
example used by Krishnamurthy et al. [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ]. The degree distribution (DD) allows
the derivation of multiple single scalar values. We inspect the average node
degree, the average in-degree, the average out-degree and the maximum degree of
the networks. We are deriving multiple single scalar values from the shortest path
length distribution (SP), too. The characteristic path length is a measure for the
expected path length in the network, the diameter is the maximum shortest path
length in the network. As the diameter is prone to distortions, we are analyzing
the e↵ective diameter of the 90% quantile. This metric computes the maximum
shortest path length for the part of 90% connected nodes in the network. We use
the definition from Chakrabarti et al. [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] in our implementation. By ignoring the
end of the heavy tail, we remove the sensitivity for deformed networks. Table 2
lists the used metrics, submetrics and abbreviations for the metrics.
We propose a new approach to circumvent the nescience of the original networks
properties. Instead of evaluating the sampling algorithms with respect to the speed
of convergence, we evaluate the monotonicity of the property convergence along
increasing sample sizes. The monotonicity is an important property to support
the development of estimators to project the properties of the sampled network
to the original network. We rank the well known and commonly used explorative
sampling algorithms with respect to the monotonicity properties of their samples.
        </p>
        <p>The looked-for algorithm produces smoothly approximating properties among
increasing sampling sizes. We count the direction changes of the property
convergence while sampling with increasing sample sizes. Figure 1 shows the aggregrated
AC of a webgraph, sampled 100 times with DFS (in blue) and RW (in red). The
original networks assortativity coecient is plotted as a green line. The AC
sampled with RW is obviously slower converging than the AC sampled with DFS, even
though, the smooth progression allows develop a simpler estimator for the original
property values. We are computing the monotonicity by comparing the property
Intuitively, a change of monotonicity is defined as a change in the direction of
convergence as in Eq. (2) at position i to i+1 . An intermediate equality of successive
property values does not cause a change in monotonicity.</p>
        <p>"1 ... "i #i+1 #i+2 ...
(2)
A good sampling algorithm approximates the inspected properties with as few
changes in the monotonicity progress as possible. This property definition of a
good sampling algorithm is also intuitive as it matches the perception of a good
approximation.
4</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Evaluation</title>
      <p>
        We used GTNA [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] to implement and compute the sampling process. GTNA
allows to integrate the network generation, the application of the sampling
algorithms and the computation of the graph-theoretic metrics. We initialized the
sampling algorithms with random start nodes, and executed 100 runs for the
networks with  500, 000 nodes and 20 runs for the networks with &gt; 500, 000 nodes
to reduce the impact of randomness in the sampling process. We sampled 1%
10% in 1% steps and 15% on all networks, networks with  500, 000 nodes are
also sampled with 20% and 25%.
      </p>
      <p>We selected networks based on the related work, which is presented in Section
2. The analyzed networks are listed in Table 3. The networks are available at the
SNAP project3. We identified four groups of networks based on their network type:
social networks in a directed and undirected form, directed p2p networks and a
directed webgraph. We calculated the earlier presented network properties (AC,
CC, DD, and SP) to provide a useful evaluation for the property monotonicity of
3 Stanford Large Network Dataset Collection, available at http://snap.stanford.edu/data/
the sampled networks. We show the results in Table 4. The sampling algorithms
are ranked according to their monotonicity properties. We ignored the
transitivity in this evaluation, since all algorithms perform very good with respect to the
monotonicity of the progression of this metrics values. As introduced in Section 3,
we are concentrating on single scalar properties, and therefore we are deriving the
submetrics as listed in Table 2. These single scalar values are combined as in Eq.
(3) to have a single value per property. To avoid an overweighting of one of these
submetrics, we are normalizing by the number of submetrics.</p>
      <p>SP =
(cpl + diam + ee↵ctiveDiam)
3
; DD =
(avg + avgin + avgout + max)
4
(3)
We expect to provide a recommendation of sampling algorithms for the complete
network class. Therefore, we sum up the single network properties of each network
instance in a group to have a cumulated monoticitiy value per network property. To
be able to compare these values over the borders of a group, we are normalizing
the values by the number of networks within the group. This is formulated in
Eq. (4) (group represents the network groups, listed in Table 3). The function
M returns the value of a certain metric, computed on the given network. Mgroup
collects the normalized, added values of the analyzed metrics. M is a placeholder
for the computed metrics, listed in Table 2.</p>
      <p>Mgroup =
1</p>
      <p>X
|group| nw2 group</p>
      <p>M (nw) ; M 2 { AC, CC, DD, SP }
To be able to provide a recommendation not only on the basis of a single property
but for a monotone sampling of the complete network, we cumulate the values of
the single network properties of a group and are normalizing them by the number
of metrics. This is shown in Eq. (5) and described by ⌫ .</p>
      <p>⌫ group =</p>
      <p>ACgroup + CCgroup + DDgroup + SPgroup
4
The Uniform Sampling (US), which is a random node selection algorithm, is used
as ground truth. This algorithm is not practicable in real world applications, but
as it produces a real random sample it is a good baseline to compare the
monotonicity of the analyzed algorithms with. We are providing in Table 4 the by ⌫
sorted ranking of sampling algorithms. Table 4 shows the domination of the
random walk algorithms on the directed social network and the webgraph, the BFS
algorithms are dominating the undirected social network and the P2P networks.
An interesting fact is the stable presence of the US in the upper half of the
ranking. Besides the webgraph, the simple algorithms are not far behind the forefront.
There is typically either BFS or RW within the best five sampling algorithms. We
(4)
(5)
are showing in Figure 2(a) the plotted values of the results for the directed social
network. We have built groups for AC, CC, DD, and the SP, the last group is
showing the ⌫ -values of the column from Table 4. The AC is completely monotone
sampled by the RW and the RW-MH, the other algorithms are including changes
in the monotonicity, a high amount of changes is especially visible at the group of
BFS algorithms. The plot of the CC values showing similar results, the advantage
of the random walk group is even higher, besides the DFS, all algorithms are
better than the BFS algorithms. The monotonicity analysis of the DD metric shows
similar monotonicity values for all sampling algorithms. The advantage of the RW
algorithms is not distinctly present. The SP metric is well preserved by the BFS,
the advanced algorithms of the BFS group and the group of RW algorithm
produce similar monotonicity values. The advantage of the BFS is intuitive, as the
shortest path properties are constantly converging by extending the exploration of
the direct neighborhood in rings. The RW algorithms are traversing into the deep
of the network and are usually producing longer paths. Beside the SP property, the
results for the webgraph are similar. Figure 2(d) shows the plot for the webgraph.
The advantage of the BFS on the SP property is not present on the webgraph.
The undirected social network, shown in Figure 2(b), is similarly sampled with
all algorithms. The monotonicity of all samples is similar for the combined metrics
of the network, DD and SP. The CC has a negative outlier with RW-DC which
is not monotonously converging. The AC has two positive outliers BFS and RDS,
which produce very monotone AC values.</p>
      <p>The P2P network, shown in Figure 2(c), is the only network with di↵erent
monotonicity results, as shown in Table 4. The advantage of the RW algorithms is not
present. The AC is dominated by BFS and FF, while the CC is dominated by BFS,
RS and RS-DC. The DD is nearly equal monotone for all sampling algorithms, only
the BFS has a change in the monotonicity but these values can be neglected with a
value of 0.25. The SP property is slightly dominated by RW algorithms. Combined
to the ⌫ value, the algorithms perform with similar monotonicity for the complete
network. There is no predominant algorithm for this group of P2P networks.
5</p>
    </sec>
    <sec id="sec-4">
      <title>Conclusion</title>
      <p>Today’s networks are large, often too large to understand and process them
directly. The computation of graph-theoretical properties on these large networks is
a challenging task. We need to reduce the networks complexity and therefor the
size of the network. The main technique for achieving this reduction of complexity
is sampling by exploration. These sampling algorithms traverse through the
network and collect the sample. Due to the network structure, some algorithms distort
the networks properties. Many improved sampling algorithms were introduced to
overcome these biased sampling processes. The properties of the sampled network
are highly depending on the underlying network and the used sampling algorithm.
As the original networks properties are mostly unknown, we are not able to
compare the sample properties with them. Therefore, it is undecidable if or when the
quality demands are met. We propose another way to overcome this problem. We
evaluate the convergence monotonicity to support the development of an estimator
for the common original network properties. The common properties are e.g. the
degree distribution, and the shortest path distribution. To be able to provide a
useful monotonicity evaluation, we chose networks based on the related work, which
analyzes the newly developed sampling algorithms. To evaluate the convergence
monotonicity, we sampled multiple times and compute the properties of the
samples. The convergence along increasing sample sizes is collected and aggregated.
We rank the sampling algorithms with respect to the monotonicity values of their
samples.</p>
      <p>The main results of our evaluation are as follows: the complex algorithms
enhancing the simple basic algorithms are not necessarily better in our monotonicity
metric. Moreover, the simple algorithms random walk and breadth first sampling
are the best algorithms of their group or at least at the forefront of their groups.
The random walk algorithms are typically outperforming their breadth first
sampling counterparts. The breadth first sampling algorithms are only as good as the
random walk algorithms on the P2P and undirected social networks. The shortest
path properties are well preserved by the breadth first sampling, they are inferior
(a)
Fig. 2. Results of the monotonicity of the analyzed sampling algorithms: (a) directed social networks,
(b) undirected social networks, (c) p2p networks, (d) webgraphs.</p>
      <p>compared to the random walk ones on the webgraph and the P2P networks only.
An interesting fact is that the monotonicity of the uniform sampling which is used
as ground truth: The uniform sampling is never the most monotone algorithm but
is in the better half of the ranking on all networks.</p>
      <p>In the future work, we will develop a metric to measure the monotonicity of
network properties which are not describable with a single scalar value, but with a
distribution. The open question regarding this metric is: What is a monotonely
converging distribution and how to measure this monotonicity? After answering
this question, we want to develop estimators to assess the properties of the original
network by analyzing sampled networks.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Chakrabarti</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Faloutsos</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <article-title>Graph Mining : Laws , Generators , and Algorithms</article-title>
          .
          <source>ACM Computing Surveys</source>
          <volume>38</volume>
          ,
          <string-name>
            <surname>March</surname>
          </string-name>
          (
          <year>2006</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Goodman</surname>
            ,
            <given-names>L. A. Snowball</given-names>
          </string-name>
          <string-name>
            <surname>Sampling</surname>
          </string-name>
          .
          <source>The Annals of Mathematical Statistics</source>
          <volume>32</volume>
          ,
          <issue>1</issue>
          (Mar.
          <year>1961</year>
          ),
          <fpage>148</fpage>
          -
          <lpage>170</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Heckathorn</surname>
          </string-name>
          , D. D.
          <article-title>Respondent-driven sampling: a new approach to the study of hidden populations</article-title>
          .
          <source>Social problems 44</source>
          (
          <year>1997</year>
          ),
          <fpage>174</fpage>
          -
          <lpage>199</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Krishnamurthy</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sun</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Faloutsos</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Tauro</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Sampling Internet Topologies : How Small Can We Go</surname>
          </string-name>
          ? In International Conference on Internet Computing (
          <year>2003</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Kurant</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Markopoulou</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Thiran</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <article-title>On the bias of BFS (Breadth First Search)</article-title>
          .
          <source>In 2010 22nd International Teletrac Congress (lTC 22)</source>
          (Sept.
          <year>2010</year>
          ), IEEE, pp.
          <fpage>1</fpage>
          -
          <lpage>8</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kim</surname>
          </string-name>
          , P.-J., and
          <string-name>
            <surname>Jeong</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          <article-title>Statistical properties of sampled networks</article-title>
          .
          <source>Physical Review E 73</source>
          ,
          <issue>1</issue>
          (Jan.
          <year>2006</year>
          ),
          <fpage>016102</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Leskovec</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Faloutsos</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <article-title>Sampling from large graphs</article-title>
          .
          <source>In Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining</source>
          (New York, NY, USA,
          <year>2006</year>
          ),
          <source>KDD '06</source>
          , ACM, pp.
          <fpage>631</fpage>
          -
          <lpage>636</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Newman</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <article-title>Mixing patterns in networks</article-title>
          .
          <source>Physical Review E 67</source>
          ,
          <issue>2</issue>
          (Feb.
          <year>2003</year>
          ),
          <fpage>026126</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Newman</surname>
            ,
            <given-names>M. J. Assortative</given-names>
          </string-name>
          <article-title>Mixing in Networks</article-title>
          .
          <source>Physical Review Letters</source>
          <volume>89</volume>
          ,
          <issue>20</issue>
          (Oct.
          <year>2002</year>
          ),
          <fpage>208701</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Rasti</surname>
            ,
            <given-names>A. H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Torkjazi</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rejaie</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stutzbach</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Duffield</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Willinger</surname>
            ,
            <given-names>W. Evaluating</given-names>
          </string-name>
          <article-title>Sampling Techniques for Large Dynamic Graphs</article-title>
          .
          <source>Technical Report CIS-TR-08-01</source>
          , Department of Computer and Information Science, University of Oregon, http://mirage.cs.uoregon.edu/pub/tr08-
          <fpage>01</fpage>
          .pdf, Sept.
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Ribeiro</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Towsley</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <article-title>Estimating and sampling graphs with multidimensional random walks</article-title>
          .
          <source>In Proceedings of the 10th annual conference on Internet measurement - IMC '10</source>
          (New York, New York, USA, Nov.
          <year>2010</year>
          ), ACM Press, p.
          <fpage>390</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Schiller</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bradler</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schweizer</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <article-title>Mu¨hlha¨user, M., and</article-title>
          <string-name>
            <surname>Strufe</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          <article-title>GTNA: a framework for the graph-theoretic network analysis</article-title>
          .
          <source>In Proceedings of the 2010 Spring Simulation Multiconference</source>
          (San Diego, CA, USA,
          <year>2010</year>
          ), SpringSim '10,
          <string-name>
            <surname>Society</surname>
          </string-name>
          for Computer Simulation International, pp.
          <volume>111</volume>
          :
          <fpage>1</fpage>
          --
          <lpage>111</lpage>
          :
          <fpage>8</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Stutzbach</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rejaie</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Duffield</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sen</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Willinger</surname>
            ,
            <given-names>W. Sampling</given-names>
          </string-name>
          <article-title>Techniques for Large, Dynamic Graphs</article-title>
          .
          <source>In INFOCOM 2006. 25th IEEE International Conference on Computer Communications. Proceedings (Apr</source>
          .
          <year>2006</year>
          ), Ieee, pp.
          <fpage>1</fpage>
          -
          <lpage>6</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>