Stream-based Community Discovery via Relational Hypergraph Factorization on Evolving Networks


                   Christian Bockermann and Felix Jungermann


           Technical University of Dortmund, Artificial Intelligence Group




       Abstract. The discovery of communities or interrelations in social networks
       has become an important area of research. The increasing amount of information
       available in these networks and its decreasing life-time pose tight constraints
       on the information processing: storage of the data is often prohibited due to
       its sheer volume.
       In this paper we adapt a flexible approach for community discovery offering
       the integration of new information into the model. The continuous integration
       is combined with a time-based weighting of the data, allowing obsolete
       information to be discarded from the model building process.
       We demonstrate the usefulness of our approach by applying it to the popular
       Twitter network. The proposed solution can be directly fed with streaming data
       from Twitter, providing an up-to-date community model.


1    Introduction

Social networks like Twitter or Facebook have recently gained a lot of interest
in data analysis. A social network basically consists of various types of entities,
such as users, keywords or resources, which are in some way related to one
another. A central question is often the discovery of groups of individuals within
such networks - the finding of communities. Thus, we seek a clustering of the set
of entities into subsets such that the individuals within each subset are most
similar to each other and most dissimilar to the entities of all other subsets.
The similarity of entities is provided by their relations to one another.
    The relations between different entities are implied by the communication
taking place within the network. Users exchange messages, which contain references
to other users, are tagged with keywords or link to external resources by means
of URLs. Figure 1 shows a message from the Twitter platform, which implies
relations between the user yarapavan, the URL http://j.mp/fpga-mr and the tag #ML.
    A natural perception of a social network is that of a connected graph, which
models each entity as a node and contains (weighted) edges between related
entities. Such a graph can be easily described by its adjacency matrix: with d
being the number of entities in our social network, we end up with a (sparse)
matrix A of size d², where A_{i,j} = w if entity i is related to entity j with
weight w, and 0 otherwise. However, this representation is not well-suited for
n-ary relations.
                       Fig. 1. Example tweets of the Twitter platform


   A well-established representation of multi-dimensional relations is given by
tensors [1, 2, 5, 13, 9, 6, 15]. A tensor is a multi-way array and can be seen as
a generalization of a matrix. Tensors have been successfully used in
multi-dimensional analysis and recently gained attention in social network
mining [1, 2, 5]. In the case of social networks, tensors can be used to describe
n-ary relations by using one tensor for each type of relation. Ternary relations
of type (user, tag, url) can then be described by a mode-3 tensor X with

    X_{i,j,k} = w   if user i, tag j and url k are related
                0   otherwise.

More complex n-ary relations will be reflected in tensors of mode n.
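
To make this concrete, here is a minimal Python sketch of such a mode-3 tensor
stored sparsely as a dictionary of non-zero entries; the class and names are
illustrative and not part of the paper.

    # A minimal sketch (illustrative) of a sparse mode-3 tensor for
    # (user, tag, url) relations: only non-zero entries are stored,
    # keyed by their index tuple (i, j, k).
    class SparseTensor:
        def __init__(self, shape):
            self.shape = shape            # e.g. (n_users, n_tags, n_urls)
            self.entries = {}             # (i, j, k) -> weight w

        def set(self, index, weight):
            self.entries[index] = weight

        def get(self, index):
            return self.entries.get(index, 0.0)   # unobserved relations are 0

    # Example: user 3 used tag 7 together with url 12 in one message.
    X = SparseTensor(shape=(5000, 1000, 1000))
    X.set((3, 7, 12), 1.0)
    print(X.get((3, 7, 12)), X.get((0, 0, 0)))    # -> 1.0 0.0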



Tensor-based Community Discovery

Community discovery in such tensor representations is mapped to a decomposition
of the tensors into a product of matrices U^{(i)} ∈ R^{m_i × k} which approximates
the tensor

    X ≈ [z] ∏_i ×_{d_i} U^{(i)}.

Each of the matrices U^{(i)} in turn reflects a mapping of entities to clusters
{1, . . . , k}. The [z] factor is a super-diagonal tensor which serves as a glue
element (see Section 3 for details). A variety of different decomposition
techniques such as Tucker3 or Parafac (CP) have been proposed previously [3, 10, 7].
Approximation is commonly measured by some divergence function. In [5] the authors
proposed a clustering framework based on tensor decompositions which has been
generalized to Bregman divergences. In [4] Bader et al. used CP tensor
decomposition to detect and track informative discussions in the Enron email
dataset by working on the ternary relation (term, author, time). These approaches
have been applied to decompose single tensors. In [12] the authors introduced
MetaFac, which is a factorization of a set of tensors with shared factors (the
U^{(q)} matrices). This allows for the discovery of one global clustering based on
multiple tensor descriptions of the data. The time complexity of these tensor
decompositions is generally governed by the number of non-zero elements of
the tensors (provided that a sparse representation is used).
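
For readers who want to experiment with a single-tensor decomposition (as opposed
to the shared-factor MetaFac scheme discussed below), a short sketch using the
tensorly library's CP (Parafac) routine might look as follows; it assumes a recent
tensorly version where parafac returns a CP tensor with a .factors attribute, and
uses random toy data.

    import numpy as np
    import tensorly as tl
    from tensorly.decomposition import parafac

    # Illustrative only: CP decomposition of a single dense mode-3 tensor.
    # MetaFac, in contrast, factorizes several tensors with shared factor matrices.
    X = tl.tensor(np.random.rand(50, 30, 20))   # toy (user, tag, url) tensor
    k = 10                                      # number of communities / clusters

    cp = parafac(X, rank=k)                     # weights + one factor matrix per mode
    U_user, U_tag, U_url = cp.factors
    print(U_user.shape, U_tag.shape, U_url.shape)   # -> (50, 10) (30, 10) (20, 10)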



Stream-based Community Discovery  The majority of the tensor decomposition
methods so far are based on a static data set. To incorporate streaming data, the
stream is broken down into blocks and the decompositions are re-computed for each
of the new blocks [12]. A common way to handle time is to introduce a trade-off
factor between the old data and the data contained in the new blocks.
    In [14] Sun et al. presented dynamic tensor analysis. They handle n-ary
relations by tensor decomposition using stream-based approximations of correlation
matrices. They also presented a stream-based approach which is not directly
comparable to ours: they process a given tensor by unfolding it along every mode
and then handle each column of the resulting matrices as a stream to update their
model. In practice, we cannot assume such an original tensor to be given. In
contrast to [14], we also consider multiple relations which have to be updated at
each iteration instead of just one.



Contributions

The critical bottleneck of the tensor decomposition methods is often their
runtime. According to [12], the runtime for a decomposition of a set of tensors
can be bounded by O(N), where N is the number of entries in all tensors. However,
this number can be rather large: we extracted about 590k entries (relations) from
200k messages of the Twitter platform.
    In this work, we present an adaptation of the MetaFac framework proposed
in [12]. Our contributions are as follows:

 1. We integrate a sampling strategy into the MetaFac framework. Effectively,
    we limit the maximum size of the tensors, and therefore N, and use a
    least-recently-used approach to replace old entities whenever the limit of an
    entity type is exceeded.
 2. We introduce a time-based weighting for relations contained within the
    tensors. These weights decrease over time, reflecting the decreasing
    importance of links within the social networks.
 3. We present an adaptation of the MetaFac factorization which allows for a
    continuous integration of new relations into the factorization model. Instead
    of running the optimization in a per-block mode, we provide a way to
    simultaneously optimize the model while new data arrives.
 4. Finally, we provide an evaluation of our proposed adaptations on real-world
    data.

    The rest of this paper is structured as follows: Section 2 formalizes the
problem and presents the MetaFac approach on which this work is based. Following
that, we give an overview of tensor decomposition in Section 3 and provide the
basics of the required multilinear algebra terminology. In Section 4 we introduce
our stream-based adaptation of the MetaFac algorithm. We evaluate our streaming
approach on real-world data (Section 5) and present our findings in Section 6.



2    Multi-Relational Graphs

As noted above, a social network generally consists of a set of related entities.
In general, we are given sets V_1, . . . , V_k of entities of different types,
such as users, keywords or urls. Let V_i be the i-th type of entities, e.g. V_1
corresponds to users, V_2 refers to keywords and so on. A relation then is a tuple
of entities, e.g. a user-keyword relation (u_1, k_1) is an element of V_1 × V_2.
We also refer to R := V_1 × V_2 as the relation type R of the relation (u_1, k_1).
    The entities are given as strings, and we define a mapping ϕ_i for each entity
type V_i which maps entities to integers, ϕ_i : V_i → {0, . . . , |V_i| − 1}. The
mapping ϕ_i can be an arbitrary bijective function. For some w ∈ V_i we refer to
ϕ_i(w) as the index of w. We denote the string of the entity with index j by
ϕ_i^{-1}(j). This allows us to identify each entity by its index and enables us to
describe a set of relations between entities by a tensor.
    A tensor X is a generalization of a matrix and can be seen as a higher-order
matrix. A mode-k tensor X ∈ R^{I_1 × . . . × I_k} is a schema with k dimensions
where X_{i_1,...,i_k} ∈ R, i_j ∈ I_j, denotes the entry at position
(i_1, . . . , i_k). For k = 2 this directly corresponds to a simple matrix,
whereas k = 3 yields a cube.
    With the mappings ϕ_i of the entities and the tensor schema, a set of
relations X ⊆ V_{i_1} × . . . × V_{i_l(i)} can be defined as a mode-l(i) tensor
X ∈ R^{|V_{i_1}| × · · · × |V_{i_l(i)}|} with

    X_{ν_1,...,ν_{l(i)}} = 1   if (ϕ_1^{-1}(ν_1), . . . , ϕ_{l(i)}^{-1}(ν_{l(i)})) ∈ X        (1)
                           0   otherwise

where ϕ_j^{-1}(ν_j) denotes the inverse of the mapping that corresponds to the
j-th entity type of the relation, and i_1, . . . , i_l(i) are the indexes of the
entity types used in relation type i.
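
The following is a minimal sketch of how the entity mappings ϕ_i and the tensor of
eq. (1) could be built from observed relation tuples; class names and data are
illustrative, not from the paper.

    # Entity strings are mapped to integer indices (the mappings phi_i), and each
    # observed relation tuple becomes a non-zero entry of a sparse tensor.
    class EntityMap:
        def __init__(self):
            self.to_index = {}      # entity string -> index
            self.to_entity = []     # index -> entity string (the inverse mapping)

        def index(self, entity):
            if entity not in self.to_index:
                self.to_index[entity] = len(self.to_entity)
                self.to_entity.append(entity)
            return self.to_index[entity]

    users, tags, urls = EntityMap(), EntityMap(), EntityMap()
    X = {}  # sparse mode-3 tensor: (index_user, index_tag, index_url) -> 1

    for user, tag, url in [("yarapavan", "#ML", "http://j.mp/fpga-mr")]:
        X[(users.index(user), tags.index(tag), urls.index(url))] = 1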



2.1     MetaGraph

Following the above approach for k = 2, we would be considering only binary
relations, which correspond to edges in the graph representation of the social
network. The adjacency matrix of such a graph would thus be represented by a
collection of mode-2 tensors.
    MetaGraph, introduced by [12], is a relational hypergraph representing
multi-dimensional data in a network of entities. A MetaGraph is defined as a graph
G = (V, E), where each vertex corresponds to a set of entities of the same type
and each edge is defined as a hyper-edge connecting two or more vertices. By the
use of hyper-edges, the MetaGraph captures multi-dimensional relations of the
social network and therefore provides a framework to model n-ary relations.
    Given the notion of relation types defined above, each relation type
R_i = V_{i_1} × . . . × V_{i_l(i)} observed within the social network corresponds
to a hyper-edge in the MetaGraph G. Given a fixed set of relation types
R_1, . . . , R_n, we can model the occurrence of relations of type R_i by defining
a tensor X^{(i)} for each R_i as described in (1).
    This approach results in a description of the social network by means of
different relational aspects R_1, . . . , R_n. Each type R_i of relations for
which a tensor is defined reflects a subset of all the relations of the network.
Capturing the complete set of relations among all entities would obviously result
in |P(V)| = 2^{|V|} different tensors.
2.2     Community Discovery Problem

With the use of tensors we have an approximate description of our social network
by means of a set of relation types R_1, . . . , R_n. Thus we can describe our
network graph G by the data tensors which are defined according to the observed
relations R_1, . . . , R_n in G, i.e. G ↦ { X^{(1)}, . . . , X^{(n)} }. Based on
this description we seek a further partitioning of the tensor representation into
clusters of entities.
    The solution proposed in [12] is a factorization of the tensors X^{(i)} into
products of matrices U^{(q)} which share a global factor [z] and some of the
U^{(q)} matrices. Let X^{(i)} be the tensor describing
V_{i_1} × . . . × V_{i_l(i)}; then we can factorize it as

    X^{(i)} ≈ [z] ∏_{j=1}^{l(i)} ×_j U^{(i_j)}.        (2)

Within this factorization, the [z] factor is a super-diagonal tensor containing
non-zero values only at positions (i, i, . . . , i). The U^{(q)} are
R^{|V_q| × k} matrices, where |V_q| is the number of entities of the q-th entity
type and k is the number of communities we are looking for. For tensors which
relate to relation types with overlapping entity types (e.g. (user, keyword, tag)
and (user, keyword, url)) the corresponding factorizations share the related
U^{(q)} matrices (e.g. U^{user} and U^{keyword}). Here ×_j denotes the mode-j
product of a tensor with a matrix.
    With an appropriate normalization as used in [12], the U^{(q)} matrices only
contain values in [0, 1] which can be interpreted as probabilities. Based on this,
the value U^{(q)}_{l,m} can be seen as the probability of entity ϕ_q^{-1}(l)
belonging to cluster m ∈ {1, . . . , k}, and we can simply map an entity to its
cluster C(l) by

    C(l) = arg max_m U^{(q)}_{l,m}.        (3)
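
Equation (3) amounts to a row-wise argmax over a normalized factor matrix; a small
illustrative sketch (the matrix values are made up):

    import numpy as np

    # Each row of a (normalized) factor matrix U^(q) is read as a distribution
    # over the k clusters; the entity is assigned to the cluster with highest value.
    U_user = np.array([
        [0.70, 0.20, 0.10],
        [0.10, 0.80, 0.10],
        [0.30, 0.30, 0.40],
        [0.05, 0.15, 0.80],
    ])

    clusters = np.argmax(U_user, axis=1)   # C(l) = argmax_m U_{l,m}
    print(clusters)                        # -> [0 1 2 2]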

    Thus, the community discovery is mapped onto the simultaneous factorization of
a set of tensors. The objective is to find a factorized representation which
resembles the original data tensors {X^{(i)}} as closely as possible. Given some
distance measure D : R^{I_1 × . . . × I_l} × R^{I_1 × . . . × I_l} → R, this leads
to the following optimization problem:

    arg min_{[z],{U^{(q)}}}  Σ_{i=1}^{n} D( X^{(i)}, [z] ∏_{j=1}^{l(i)} ×_j U^{(i_j)} )        (4)



3      MetaFac - Metagraph Factorization
As mentioned before, the MetaGraph is a description of a multi-relational graph G
by means of a set of tensors {X^{(i)}}. The objective of the MetaFac algorithm is
to derive tensor decompositions of the X^{(i)} with shared factors [z], U^{(q)}
which closely resemble the X^{(i)}. To measure the approximation, [12] proposed
the Kullback-Leibler divergence D_KL, implying the following optimization problem:

    arg min_{[z],{U^{(q)}}}  Σ_{i=1}^{n} D_KL( X^{(i)}, [z] ∏_{j=1}^{l(i)} ×_j U^{(i_j)} )        (5)

To solve (5), the authors derived an approximation scheme by defining

    μ^{(i)} = vec( X^{(i)} ⊘ ([z] ∏_{j=1}^{l(i)} ×_j U^{(i_j)}) )        (6)

    S^{(i)} = fold( μ^{(i)} ∗ (z ∗ U^{(M_i)} ∗ · · · ∗ U^{(1_i)})^T )        (7)

where ⊘ is the element-wise division of tensors and ∗ is the Khatri-Rao product of
matrices. These values are then used to update z and the {U^{(q)}} iteratively
using

    z = (1/n) Σ_{i=1}^{n} acc(S^{(i)}, M_i + 1)        (8)

    U^{(q)} = Σ_{l: e_l ∼ v_q} acc(S^{(l)}, q, M_l + 1)        (9)

where acc is the accumulation function of tensors and M_i + 1 is the last mode of
tensor S^{(i)}. This update is carried out iteratively until the sum in eq. (4)
converges.
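
The Khatri-Rao product ∗ used in eq. (7) is the column-wise Kronecker product of
two matrices with the same number of columns; a minimal numpy sketch (not the
authors' implementation) is given below.

    import numpy as np

    def khatri_rao(A, B):
        """Column-wise Kronecker (Khatri-Rao) product of A (I x K) and B (J x K)."""
        I, K = A.shape
        J, K2 = B.shape
        assert K == K2, "both factors need the same number of columns"
        # each result column is the Kronecker product of the matching columns
        return (A[:, None, :] * B[None, :, :]).reshape(I * J, K)

    A = np.random.rand(4, 3)
    B = np.random.rand(5, 3)
    print(khatri_rao(A, B).shape)   # -> (20, 3)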
    The batched version of the MetaFac approximation can be derived by using the
KL-divergence, resulting in an approximation scheme with the following update
functions:

    z = (1 − α) Σ_{i=1}^{n} acc(S^{(i)}, M_i + 1) + α z^{t−1}        (10)

    U^{(q)} = (1 − α) Σ_{l: e_l ∼ v_q} acc(S^{(l)}, q, M_l + 1) + α U^{(q), t−1}        (11)



4    Stream-based Community Discovery with Tensors

In this section we present our adaptations of the MetaFac framework: we introduce
a sampling-based tensor representation of graphs and use time-stamped relations to
let the impact of relations decrease over time, reflecting the decreasing
importance of Twitter messages.
   Given a social network, we are provided with a sequence M of messages
M := ⟨m_0, m_1, . . .⟩ where each message m_i implies a set of relations R(m_i).
Let τ(m_i) ≥ 0 be the arrival time of m_i. This results in an overall sequence of
relations S := ⟨R(m_0), R(m_1), . . .⟩ which are continuously added to the
evolving social network graph G. Hence we are faced with a sequence
⟨G_{t_0}, G_{t_1}, . . .⟩ of graphs where each G_{t_i} contains the relations of
all messages up to time t_i.
    Let t, t′ be points in time with t < t′. In the following we denote by
G_{[t,t′]} the graph implied by only the messages of the time span [t, t′]; hence
G_t = G_{[0,t]}. Accordingly, the graphs are represented by their corresponding
tensors as G_{[t,t′]} ↦ { X^{(1)}, . . . , X^{(n)} }_{[t,t′]}.


4.1     The MFstream Algorithm

The MetaFac approach uses a sliding window of some fixed window size w_s to manage
streams. Given a sequence of time points t_j for j ∈ N with t_j = t_{j−1} + w_s,
it factorizes {X^{(i)}}_{[t_{j−1},t_j]} based on a trade-off factor α as denoted
in equations (10) and (11).
    Our MFstream algorithm interleaves the optimization of MetaFac with the
addition of new relations and uses a time-based weighting function to take the
relations' decreasing importance into account. Additionally, the optimization is
carried out over only a partial set of relations, as older relations tend to
become obsolete for adjusting the model. We present the time-based weighting and
the sampling strategy in the following and give the complete algorithm in
Section 4.4.



4.2     Time-based Relation Weighting

So far we have considered the property of two or more entities being related as a
binary property, i.e. if entities i, j and k are related, then X_{i,j,k} = w with
w ∈ {0, 1}. With the extraction of relations from time-stamped messages, as
provided by the Twitter platform, we are interested in incorporating the age of
these relations to reflect the decreasing up-to-dateness of the information.
    Hence we associate each relation r ∈ R_i with a timestamp τ(r), the time at
which this relation was created (i.e. the time of the message from which it was
extracted). With S being the set of relations extracted from messages, this
leaves us with the tensor representation of relation type
R_i = V_{i_1} × . . . × V_{i_l(i)} as

    X^{(i)}_{ν_1,...,ν_{l(i)}} = τ(r)   if r = (ϕ_1^{-1}(ν_1), . . . , ϕ_{l(i)}^{-1}(ν_{l(i)})) ∈ S        (12)
                                 0      otherwise
    In addition, we introduce a global clock, denoted by τ_max, which represents
the largest (i.e. the most recent) timestamp of all relations observed so far,
i.e. τ_max := max{ τ(r) | r ∈ X }. Storing the timestamp τ(r) for each entry r in
the tensors allows us to define a weighting function for the relations based on
the global clock value. A simple example of a parametrized weighting function is

    ω_{α,β}(r) := α / (α + (1/β)(τ_max − τ(r))).        (13)
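
A direct transcription of eq. (13) as a small Python function; the parameter
defaults are chosen to match the ω_{10,1000} setting used in the evaluation and
the timestamps are made up for illustration.

    def omega(tau_r, tau_max, alpha=10.0, beta=1000.0):
        """Weight of a relation created at time tau_r, given the global clock tau_max."""
        return alpha / (alpha + (tau_max - tau_r) / beta)

    tau_max = 1_000_000.0                      # e.g. seconds; most recent timestamp
    print(omega(tau_max, tau_max))             # fresh relation -> 1.0
    print(omega(tau_max - 50_000, tau_max))    # older relation -> weight decays towards 0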
4.3     Sampling

The runtime of each iteration of the approximation scheme is essentially
determined by the maximum number N of non-zero entries in the tensors. To reduce
the overall optimization time, we restrict the size of the tensors, i.e. the
number of entities of each type, by introducing constants C_q ∈ N and providing
new entity mappings ϕ̄_q with

    ϕ̄_q : V_q → V̄_q   with   V̄_q = {1, . . . , C_q}.

This has two implications: clearly, these ϕ̄_q will not be bijective anymore if
|V_q| > C_q. Moreover, the size of the U^{(q)} matrices will also be limited to
C_q × k.
    We deal with these restrictions by defining dynamic entity mappings ϕ̄_q,
which map a new entity e (i.e. an entity that has not been mapped before) to the
next free integer of {1, . . . , C_q}. If no such element exists, we choose
f ∈ {1, . . . , C_q} as the element that has been inactive the longest, i.e. has
not been mapped to by ϕ̄_q. The relations affected by f are then removed from all
tensors and from the current cluster model, i.e. U^{(q)}_{f,i} = 1/k for all
i = 1, . . . , k.
    Effectively this introduces a current window {X^{(i)}} of relations that
affect the adaptation of the clustering in the next iteration. In contrast to the
original MetaFac approach, this also frees us from having to know the number of
entities and a mapping of the entities beforehand.
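
The dynamic, capacity-limited entity mapping can be sketched with a
least-recently-used eviction policy; the following is an illustrative
approximation that tracks recency by last access rather than by the relation
timestamps τ used in Algorithm 1, and all names are hypothetical.

    from collections import OrderedDict

    # Capacity-limited entity mapping: when all C_q slots are taken, the slot that
    # has been inactive the longest is reused and its old entity is evicted.
    class BoundedEntityMap:
        def __init__(self, capacity):
            self.capacity = capacity
            self.slots = OrderedDict()   # entity -> slot index, ordered by last use

        def index(self, entity, on_evict=None):
            if entity in self.slots:
                self.slots.move_to_end(entity)          # mark as recently used
                return self.slots[entity]
            if len(self.slots) < self.capacity:
                slot = len(self.slots)                  # next free integer
            else:
                evicted, slot = self.slots.popitem(last=False)  # oldest entity
                if on_evict:
                    on_evict(evicted, slot)             # e.g. reset U^(q)[slot] to 1/k
            self.slots[entity] = slot
            return slot

    users = BoundedEntityMap(capacity=3)
    for u in ["a", "b", "c", "a", "d"]:      # "b" is evicted, its slot is reused
        print(u, users.index(u))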



4.4     Continuous Integration

With the prerequisites of Sections 4.2 and 4.3 we now present our stream
adaptation MFstream as Algorithm 1. MFstream is a purely dynamic variant of
MetaFac which adds new relations to the data tensors {X^{(q)}} and fits the model
[z], {U^{(q)}} after a specified number T of messages. This differentiates our
approach from MetaFac, as the optimization is performed by running a single
iteration of the optimization loop, with respect to the time-based weighting,
after the relations of T messages have been added to the tensors. The time
complexity per iteration of the MFstream algorithm is the same as for the MetaFac
algorithm (see Section 3). Due to the fixed tensor dimensions, the maximum number
of non-zero elements N is constant, which implies O(1) runtime per iteration.



5      Evaluation

For the evaluation of our approach we extracted relations from the Twitter
website. Twitter is a blogging platform giving users the opportunity to inform
other users by very small snippets of text containing a maximum of 140 characters.
In spite of this limitation, users are not only posting messages, called tweets,
but also enriching their tweets with tags, urls or mentions, which allow users to
address other users. This brings up the entity types {user, tweet, tag, url}.
    To discover clusters on the above-mentioned entities present on the Twitter
platform, we constructed a metagraph for Twitter. The entity types themselves
Algorithm 1 The MFstream algorithm.
 1: Input: MetaGraph G = (V, E), stream M = ⟨m_i⟩, capacities C_q, number of clusters k, constant T ∈ N
 2: procedure MFstream
 3:    Initialize z, {U^{(q)}}, c := 0
 4:    while M ≠ ∅ do
 5:        m := m_c, c := c + 1                                   ▷ Pick the next message from the stream
 6:        for all (r_{j_1}, . . . , r_{j_l(j)}) ∈ R(m) do
 7:            for all p = 1, . . . , l(j) do
 8:                if ϕ_p(r_{j_p}) = nil then                      ▷ Replacement needed?
 9:                    if |ϕ_p| = C_p + 1 then
10:                        f* := arg min_{f ∈ ϕ_p} τ(f)
11:                        U^{(p)}_{f*,s} := 1/k  ∀ s = 1, . . . , k
12:                    else
13:                        f* := min{ f ∈ {1, . . . , C_p} | ϕ_p^{-1}(f) = nil }   ▷ Pick the next unmapped f*
14:                    end if
15:                    ϕ_p(r_{j_p}) := f*,  τ(f*) := τ(m)
16:                end if
17:            end for
18:            ν_p := ϕ_p(r_{j_p}) for p = 1, . . . , l(j)
19:            X^{(j)}_{ν_1,...,ν_{l(j)}} := τ(m)                  ▷ Update the corresponding tensor
20:        end for
21:        if c ≡ 0 mod T then                                     ▷ Single optimization iteration every T steps
22:            for all i ∈ {1, . . . , n} do
23:                compute S^{(i)} by eqs. (6) and (7)
24:                update z by eq. (8)
25:            end for
26:            for all j ∈ {1, . . . , q} do
27:                update U^{(j)} by eq. (9)
28:            end for
29:        end if
30:    end while
31: end procedure




imply as many as |P(V)| = 2^4 = 16 possible relation types, some of which do not
arise or are redundant. For example, since each tweet is written by a user, there
is no relation (tweet, tag) which does not also refer to a user. Hence, our
MetaGraph is based on the relation types {R_1, . . . , R_8} given in Fig. 2. We
extracted 1000 seed



   R1: a user writing a tweet.
   R2: a user writing a tweet containing a special tag.
   R3: a user writing a tweet containing a special url.
   R4: a user mentioning another user in a written tweet.
   R5: a user writing a tweet containing a tag and an url.
   R6: a user writing a tweet containing an url and mentioning another user.
   R7: a user writing a tweet containing a tag and mentioning another user.
   R8: a user mentioning another user in a tweet containing a tag and an url.

               Fig. 2. Relation types for the Twitter metagraph




users and their direct friends and followers. Followers are following a user,
which means that messages of that user are directly visible to the followers on
their Twitter page. Friends are all the users a particular user is following. We
used an English stopword filter to identify users writing in the English language
and processed all friends and followers of the seed users, revealing about
478,000 users. For these, we extracted all messages written between the 19th and
23rd of February 2010. Out of these 2,274,000 tweets we used the tweets written on
the 19th of February for our experiments, leaving about 389,000 tweets from
41,000 users.



5.1      Evaluating the Model


For a comparison of the clusterings produced by MFstream and the MetaFac approach
we employ the within-cluster point scatter [8]. This is given as

    W(C) := Σ_{k=1}^{K} N_k Σ_{C(i)=k} ‖x_i − x̄_k‖²        (14)

where K is the number of clusters, x_i is a member of cluster C(i), and x̄_k is
the centroid of cluster k. It can be seen as a sum of dissimilarities between the
elements within the particular clusters.
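
A small numpy sketch of eq. (14), assuming entities are given as feature vectors
with one cluster label each; the data here is made up for illustration.

    import numpy as np

    def within_cluster_scatter(X, labels):
        """Within-cluster point scatter W(C): sum over clusters of N_k times the
        summed squared distances of members to their cluster centroid."""
        total = 0.0
        for k in np.unique(labels):
            members = X[labels == k]
            centroid = members.mean(axis=0)
            total += len(members) * np.sum((members - centroid) ** 2)
        return total

    X = np.random.rand(100, 5)                    # 100 entities, 5 features
    labels = np.random.randint(0, 10, size=100)   # k = 10 clusters
    print(within_cluster_scatter(X, labels))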
      We created clusterings (always using k = 10) on a stream of 200k messages
with MFstream and restricted the tensor dimensions to C_user = C_tweet = 5000 and
C_tag = C_url = 1000. We employed several weighting functions such as ω_{1,1},
ω_{10,1000} and ω_{100,1000}, as well as a binary weighting which equals the
unweighted model (i.e. w ∈ {0, 1}).
    To be able to compare the clusterings of MFstream and MetaFac, we processed
messages until the first entity type V_i reached its limit and stored the
resulting clustering on disk. We then reset the ϕ mappings and started anew,
yielding a new clustering every time an entity type i reached C_i, and a total of
93 clusterings. We applied MetaFac to the messages that had been used to create
these 93 clusterings and computed their similarities using W(C). Figure 5 shows
that MFstream delivers results comparable to MetaFac for different weighting
functions; it also shows that using timestamped values instead of binary values in
MFstream delivers better results. Decreasing T, which implies a larger number of
optimization steps, intuitively increases the quality of MFstream, as attested by
Figures 4 and 6.
      In addition, we performed experiments to show the effect of the update
frequency T on the runtime. Figure 7 shows the relative runtime of MFstream, where
T = 1 corresponds to the baseline at 1.0. Raising T results in shorter runtimes,
since the model is updated less frequently, which is the major time factor. The
upper curve shows the runtime for updating after 5 relations (T = 5), the middle
one shows T = 10, and the lower one T = 50.
    Varying the sizes of the entity types via the C_q results in clusterings over
different numbers of entities, which cannot be directly compared by W(C). Hence,
we normalized W(C) by the variance V of each clustering. Larger models of course
incorporate more information, which results in more stable clusterings, as can be
seen in Figure 8.
Weighting     W(C) (mean)     std. deviation
MetaFac       5.685 · 10^7    1.32 · 10^7
binary        7.511 · 10^7    2.00 · 10^7
ω_{1,1}       6.142 · 10^7    1.57 · 10^7
ω_{1,1000}    6.002 · 10^7    1.38 · 10^7
ω_{10,1000}   6.272 · 10^7    1.43 · 10^7
ω_{100,1000}  6.724 · 10^7    1.55 · 10^7

Fig. 3. Mean W(C) for different weightings (T = 20), comparing MFstream and MetaFac.

T      W(C) (mean)      std. deviation
5      6.043 · 10^7     1.35 · 10^7
10     6.133 · 10^7     1.36 · 10^7
50     6.058 · 10^7     1.40 · 10^7
100    6.465 · 10^7     1.49 · 10^7
250    10.882 · 10^7    3.13 · 10^7
500    43.419 · 10^7    11.47 · 10^7

Fig. 4. Mean W(C) for different update steppings T (weight used: ω_{1,1000}).

Fig. 5. W(C) for MFstream compared to the MetaFac clusterings.

Fig. 6. Relative W(C) of MFstream using different update step sizes T.

Fig. 7. Relative runtime of MFstream using different numbers of relations per update.

Fig. 8. W(C)/V of MFstream using different model sizes.




6    Conclusion and Future Work

In this work we presented MFstream, a flexible algorithm for clustering
multi-relational data from evolving networks, derived from the MetaFac framework
of [12]. The main improvement of our approach is the reduction of the
approximation scheme to a small, relevant window of relations. The proposed
time-based weighting of relations contributes to this reduction by removing
obsolete information that is no longer relevant for adapting the model. MFstream
is able to handle relations containing new, unseen entities by offering a
replacement strategy for the set of entities considered at optimization time.
This makes it especially suitable for continuously integrating new data from a
stream. We evaluated MFstream on real-world data crawled from the Twitter
platform and showed its comparability to MetaFac.
    The use of a backend storage for off-loading obsolete data that can be
re-imported into the optimization window at a later stage might be an interesting
advancement. Also, the concurrent criteria of runtime and quality offer a starting
point for multi-objective optimization. Additionally, recent work [11] motivates
further improvements to handle a dynamic number k of clusters within MFstream.

References

 1. E. Acar, S. A. Çamtepe, M. S. Krishnamoorthy, and B. Yener. Modeling and
    multiway analysis of chatroom tensors. In ISI, pages 256–268, 2005.
 2. E. Acar, S. A. Çamtepe, and B. Yener. Collective sampling and analysis of high
    order tensors for chatroom communications. In ISI, pages 213–224, 2006.
 3. J. F. Allen. Towards a general theory of action and time. Artificial
    Intelligence, 23:123–154, 1984.
 4. B. Bader, M. W. Berry, and M. Browne. Survey of Text Mining II, chapter
    Discussion tracking in Enron email using PARAFAC, pages 147–163. Springer, 2007.
 5. A. Banerjee, S. Basu, and S. Merugu. Multi-way clustering on relation graphs.
    In SDM. SIAM, 2007.
 6. D. Cai, X. He, and J. Han. Tensor space model for document analysis. In E. N.
    Efthimiadis, S. T. Dumais, D. Hawking, and K. Järvelin, editors, SIGIR, pages
    625–626. ACM, 2006.
 7. R. Harshman. Foundations of the PARAFAC procedure: Models and conditions for
    an explanatory multi-modal factor analysis. UCLA Working Papers in Phonetics,
    16, 1970.
 8. T. Hastie, R. Tibshirani, and J. Friedman. The Elements of Statistical
    Learning: Data Mining, Inference and Prediction. Springer, Berlin Heidelberg
    New York, 2001.
 9. T. G. Kolda, B. W. Bader, and J. P. Kenny. Higher-order web link analysis
    using multilinear algebra. In ICDM '05: Proceedings of the Fifth IEEE
    International Conference on Data Mining, pages 242–249, 2005.
10. L. D. Lathauwer, B. D. Moor, and J. Vandewalle. A multilinear singular value
    decomposition. SIAM J. Matrix Anal. Appl., 21(4):1253–1278, 2000.
11. Y.-R. Lin, J. Sun, N. Cao, and S. Liu. Contextour: Contextual contour visual
    analysis on dynamic multi-relational clustering. In Proceedings of the SIAM
    Conference on Data Mining (SDM10), 2010. Not yet published.
12. Y.-R. Lin, J. Sun, P. Castro, R. Konuru, H. Sundaram, and A. Kelliher.
    MetaFac: Community discovery via relational hypergraph factorization. In
    Proceedings of the SIGKDD 2009, pages 527–536, Paris, France, 2009. ACM.
13. A. Shashua and T. Hazan. Non-negative tensor factorization with applications
    to statistics and computer vision. In ICML '05: Proceedings of the 22nd
    International Conference on Machine Learning, pages 792–799, New York, NY,
    USA, 2005. ACM.
14. J. Sun, D. Tao, and C. Faloutsos. Beyond streams and graphs: dynamic tensor
    analysis. In Proceedings of the SIGKDD 2006, pages 374–383, New York, NY,
    USA, 2006. ACM.
15. X. Wang, J.-T. Sun, Z. Chen, and C. Zhai. Latent semantic analysis for
    multiple-type interrelated data objects. In Proceedings of the SIGIR 2006,
    pages 236–243, New York, NY, USA, 2006. ACM.