A Human Communication Network Model Oksana Pichugina1, Babak Farzad2 1 Kharkiv National University of Radio Electronics, Kharkiv, Ukraine pichugina os@mail.ru 2 Brock University, St. Catharines, Canada bfarzad@brocku.ca Abstract. A number of attributed formation models based on Erdos- Renyi and Barabasi-Albert random graph models are presented. One of them is a Human Communication Network (HCN) model based on time restrictions on face-to-face communication. Construction of this weighted network requires a few numerical parameters and allows to transform any unweighted node attributed network into weighted. This transformation helps solving numerous problems in Network Analysis such as commu- nity detection, network topology inference, etc. Understanding nature of human communication networks allows to solve many practical problems starting with fast spreading any information and innovation through the networks and ending with detecting key people, collaboration with whom helps achieving different goals. Keywords. Social Networks, Community Detection, Attributed Networks, Node Partition, Random Graphs Key Terms. Network Decoration, Community Detection, Random Graphs 1 Introduction Network Analysis is an area of research that has been studied intensively lately. Researches investigate structural characteristics of different networks, network formation models, and many other related questions. Among a variety of net- works, social networks, which reflect a diversity of people relationships, are a priority [5], [7]. Study of social networks is important since it helps understand- ing how our world is organized, what place each of us takes in it, how this situation affects us and how the knowledge can be used to achieve our goals. Social networks are characterized by heterogeneity of nodes and edges, spar- sity, high average clustering coefficient, small average shortest path length and power-law degree distribution, existing observable and tightly bound groups of elements called communities [5]. Most of these properties are united in ”small- world networks” and ”scale-free networks” concepts [1],[4]. Many attempts have been made to construct social networks, but still no satisfactory solution to simulate all the listed properties is found [1],[4],[5]. It is also important to find efficient ways of these community detection (CD) [6],[8]. We believe that the key ICTERI 2016, Kyiv, Ukraine, June 21-24, 2016 Copyright © 2016 by the paper authors - 34 - in qualitative CD in social networks is in using the heterogeneity (making them multi-layer ones) and study the issue of such networks formation. 2 Definitions and Notations Definition 1. [6] A social network is a hybrid graph, which is represented in the form: G = (V, E, Λ, Λ0 ), (1) where V is the set of nodes (the social network’s users), E is the set of edges (these users relationships), Λ and Λ0 contain an information about attributes related to each node v ∈ V and each edge {u, v} ∈ E, respectively. The network represented in the form (1) is an attributed network if Λ ∪ Λ0 = 6 ∅. So, any attributed network representing individuals’ relationships is social. 0 Let Λ ∈ IRn×K , Λ0 ∈ IRm×K hence V, E are of size |V | = n, |E| = m. Initially, we consider an unweighted node-attributed network G = {V, E, Λ}, K ≥ 1. Then we assign weights to its edges (decorate the edges) and come to consideration of a weighted network Gw = {V, E, Λ, Λ0 } with Λ0 being a matrix-column of edge weights (K 0 = 1). Gw is a node-edge-attributed and used then for CD. Introduce some notations: G[.] = {V, E [.] , Λ} - is an unweighted node-attributed [.] network with an adjacency matrix B [.] = (bij ) ∈ IRn×n . After decoration E [.] by weights, the new weighted network is denoted by Gw[.] = {V, E [.] , Λ, Λ0 } and its [.] weighted adjacency matrix (WAM) - by A[.] = (aij ) ∈ IRn×n . [.] The node degree di of a node vi ∈ V in G[.] is the number of its incident edges: [.] [.] [.] [.] di = |Ni | where Ni = {u ∈ V : u ↔ vi , {u, vi } ∈ E [.] }. The node strength si [.] of vi ∈ V is a sum of weights of its incident edges in G . In terms of adjacency [.] P [.] [.] P [.] and weighted adjacency matrices, these values are: di = j bij , si = j aij . SL A network cover is a division of the network nodes C = {Cl } satisfying l=1 Cl = V . If in the division the node clusters Cl , l ∈ JL = {1, ..., L}, are pairwise disjoint, then it is called a network partition. Assume that the nodes are decorated by K discrete attributes {AT k }, Λ = (atki ) where atki is the value of AT k for a node vi , and there are Lk different values of AT k . Let AClk ∈ V be a set of nodes with l-th value of AT k . We call it a node attribute cluster (AC) and denote a G-partition into ACs related to different values of AT k by AC k = {AClk }l∈JLk (nkl = |AClk |). Let G[.]k be a G[.] - subnetwork related to AT k . A sum of unweighted S networks {Gk }k of the same k node set V is an unweighted network G = {V, k E , Λ}. A linear combination of weighted S k networks {Gwk }k of a node P setk V is a weighted network G = w 0 {V, k E , Λ, Λ } with a WAM A = k αk A where {αk } ⊂ IR are coefficients [.] [.] of this linear P combination. Let ω(G ) be denoted a weight of a network G [.] (ω(G ) = ij aij ). A network of a weight one is a normalized network. The networks’ linear combination is a weighted network sum if X X Gw = W k · Gwk where {W k } > 0, W k = 1. (2) k k - 35 - 3 Motivation Let us consider a social network. Suppose that, in addition to basic information about the node and edge sets, there is available some extra information about the nodes and edges features (social semantic networks are highly helpful here [2]). These additional characteristics are called attributes and the procedure of their complementing is decoration of the network [3], [5] resulted in creation of an attributed network [8]. Applying CD on the network we, typically, get communities closely related to one node attribute and this dominant attribute does not allow us to observe communities in other layers related to the rest, less important, node attributes. For instance, in the humankind network the domi- nant attribute would be belongingness to families. If we are interested in study communities, say, in work place, then the family division is an obstacle on this way. However, if it is possible to transform the network into weighted, moreover, to assign edge weights to each layer subnetworks of the multi-layer network, the problem of multi-layer CD (MLCD) can be solved. For that we just detect the dominant attribute and extract the corresponding subnetwork from considera- tion repeating then the procedure on the remaining network. The crucial part of the approach is constructing edge sets of the one-layer sub- networks and distributing weights within them. The first one is a problem of the attributed network formation considered in Sect.4.1 (the edge inference problem [5]), the second one is the edge attribute inference problem [5]. The last one we solve for a social network of people face-to-face communication in Sec.4.2. 4 Human Communication Network Models At this section we touch formation of attributed networks. We are wondering how an attributed network (1) is formed if the information about nodes V and their attributes Λ is known. In other words, we review formation of an edge set E and its attributes Λ0 and refer to them as Problems 1 and 2, respectively. 4.1 Attributed Network Formation We consider a number of ways to solve Problem 1. For convenience, we interpret the presented network formation models in terms of communication of people spending a time together during common activities/interests (AIs). Here nodes are people and their AIs are the nodes’ attributes. Model 1 - an association network model. An association network Ga [5] is an example of an attributed network where links exist between any nodes with common attributes. It can be interpreted as a network of virtual contacts of people with common interests where supporting such contacts does not need anything. The auxiliary network Gwk corresponds to each activity/interest (AI) AT k ; Gw is representable as a weighted network sum (2) of K networks, which are collec- tions of complete graphs: Gwk = ∪ Knkl . Thus the network Ga is a cover of l∈JLk - 36 - K overlapping V -partitions by a disjoint union of complete graphs. Model 2 - an attributed networks model based on Erdos-Renyi Model. Suppose that for existing an edge a similarity of node attributes is necessary, but not enough because of randomness. Similar to Model 1, we represent the network Gw by (2). Edges in Gwk are created randomly with probability pkl between two nodes vi , vj sharing the l-th value of the attribute AT k . Hence Gk is a node par- tition by Erdos-Renyi Random Graphs (ERRGs) [4]: Gwk = ∪ ERRG(pkl , nkl ) l∈JLk and the resulting network Gw is an overlapping of K partitions by ERRGs. In terms of human communication, Model 2 simulates a real situation where a group of people is formed simultaneously. Contacts of each user occur randomly without analysing any prior information due to its inaccessibility. The commu- nication can be established on a regular basis only if these people actually have common interests. Different type of contacts are formed independently. Model 3 - an attributed networks model based on Barabasi-Albert Model. In comparison with Model 2, here we review a situation where a group of people is formed gradually. First of all, group members aspire to contacts with popular and authoritative colleagues in each area of expertise. First, these contacts are formed for the most important AIs, then for the less significant. A chance to clarify common interests is higher if the contact already exists. As before, Gw is a weighted network sum (2). {Gwk } are formed consecutively by k in accordance with decreasing priorities of node attributes. For each k an edge set E k is formed between nodes with the same value of AT k consecutively by i with probabilities depending on degrees of all preceding nodes {dki0 }i0