<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A Project Teams Creation Based on Communities Detection</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Mikhail Semenov</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Lev Bulygin</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Elena Koroleva</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Dilmurat Tursunov</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Tomsk Polytechnic University</institution>
          ,
          <addr-line>Tomsk</addr-line>
          ,
          <country country="RU">Russia</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Ural State Pedagogical University</institution>
          ,
          <addr-line>Yekaterinburg</addr-line>
          ,
          <country country="RU">Russia</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The purpose of this study is to detect project teams in a group. A key point in considering a group's relationships is reciprocal influence, whereby the group's members influence each other. A survey based on the reciprocal nomination method was conducted, and a social network was then constructed. The participants were first-year bachelor students of Tomsk Polytechnic University. Various social network analysis algorithms were used to cluster the network into communities. The results of the analysis were discussed with the teachers and students, and the detected community teams were then adjusted with respect to the key actors of the group. The results of the study may be used to create project teams that can act successfully together in educational projects.</p>
      </abstract>
      <kwd-group>
        <kwd>community detection</kwd>
        <kwd>key actors</kwd>
        <kwd>project team</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        A problem arising very frequently is how to identify new teams in an existing
group. Community detection helps to understand the distribution of key social
actors and their interrelations in the network [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. At present, many community
detection algorithms have been designed [
        <xref ref-type="bibr" rid="ref1 ref13 ref15 ref19 ref3 ref8">1,3,8,13,15,19</xref>
        ]. Much research has been
conducted on social network analysis (SNA) using graph theory [
        <xref ref-type="bibr" rid="ref10 ref11 ref14 ref16">10, 11, 14, 16</xref>
        ].
One of the important results is the identification of sociometric features that
characterize a network. However, it is still not clear which algorithms are reliable
and should be used in applications, because a network is a single object that
cannot simply be split into a training and a test dataset [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ].
      </p>
      <p>
        First, we need to define what is meant by a project team. A project team
is a group of people who are able to act in concert and to achieve a common
goal collectively. Team members have specific and unique roles, and the
performance of each role contributes to the achievement of the team’s goal. In project
teams, members care about the success of the other team members because their
own goal attainment is often inextricably bound to collective achievement [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ].
      </p>
      <p>This study is part of a one-year follow-up study of social network changes
among the members of a student group. The purpose of this study is to detect
project teams in a group. We present an approach to project team creation
based on SNA methodology. This approach includes descriptive analysis,
community structure identification and key actor analysis using graph theory.</p>
      <p>The paper is structured as follows. In Section 2, we give a short overview
of related work. Then, in Section 3, we introduce the dataset on which our
approach has been tested, calculate the measures of our networks, and detect
key social actors. In Section 4, we examine the influence of teams’ interactions
over time on academic performance. A brief summary follows in Section 5.</p>
    </sec>
    <sec id="sec-2">
      <title>Related Works</title>
      <p>
        Pijl et al. [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] compared two methods for the assessment of students’ friendship
networks: the reciprocal nomination method and social cognitive mapping. In
total, 190 participants took part in the experiment. The authors introduced two
types of isolated students in their study: a) a student with no reciprocated links
at all (type 1), and b) a student with one reciprocated link (type 2). A cohesive
subgroup is defined as a group of at least three students who a) have more internal
links than external links, and b) are connected by some path to each of the group
members and remain connected when up to 10% of the group is removed.
      </p>
      <p>
        Rienties et al. [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] conducted a set of experiments in order to understand
how students develop and maintain learning and friendship relations over time
in a large classroom setting (200+ students). Students were put in 41 teams of 5
students on average. The results indicate that the instructional design might have
a strong influence on how students work together in teams, how social learning
and friendship interactions develop, and finally increase academic performance.
      </p>
      <p>
        Pronin et al. [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] suggested a grouping method to reorganize student groups
using the SNA methodology. The problem was to reorganize four existing groups
of students into three new groups. The Girvan-Newman algorithm was used
to create three new groups, and the authors then adjusted the new groups based
on the existing relations between students and the modularity score. This method
may be used to create project teams for research classes or scientific labs.
      </p>
      <p>
        Liu et al. [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] proposed an algorithm to measure the importance of the
actors in a network. This algorithm is based on the in-weighted degree and out-weighted
degree of a vertex and takes the information of the directed edges into account. In
contrast, D. Conway [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] introduced a method that compares the relative values of centrality
measures such as eigenvector centrality and betweenness.
      </p>
      <p>
        Lomi et al. [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] specified a model that permits estimation of the
interdependent contribution of social selection and social influence to individual
performance. The proposed stochastic model is based on the direct observation of
connectedness between students. In their study, the authors focused on the effects of
75 participants on individual performance at the classroom level.
      </p>
      <p>
        Ertem et al. [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] used SNA metrics to predict learning performance in
terms of a student’s position in the network. The authors found a positive correlation
between students’ performance and the six metrics employed: degree, eigenvector
centrality, betweenness centrality, hub, authority and PageRank.
      </p>
      <p>
        In contrast with modularity that was proposed by M. Newman [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], there
was an accepted standard for the results of community detection. Yang et al. [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ]
found that conductance and the triangle participation ratio could provide the
best performance in characterizing the quality of community detection.
      </p>
    </sec>
    <sec id="sec-3">
      <title>Experimental Evaluation</title>
      <p>Research questions. Our one-year follow-up study aims to explore the
community organization of the network and the influence of the structural factors
of the network on the academic performance of students. We formulated the following
questions concerning the community organization of the network:</p>
      <p>Q1. What is the network structure at the classroom level?</p>
      <p>Q2. How can the network structure be used to increase the academic performance of
students?</p>
      <p>Dataset. In our experiments, a local social network was built on reciprocal
nominations among 20 first-year bachelor students of Tomsk Polytechnic
University during the fall semester of 2015. We used direct-preference questions.
The 20 students answered four social network questions:
1. Name classmates with whom you spend free time.
2. Name classmates whom you ask for information related to
academic activities.
3. Name classmates who could influence your academic performance.
4. Name classmates with whom you do not want to cooperate on a
creative project.</p>
      <p>The students were allowed to nominate up to four classmates. The participants
were 20 first-year bachelor students aged 17 to 19 years (M = 18.5, SD =
0.35; 75% male). Data were collected online using Google Forms.</p>
      <p>Data Preprocessing. Four square matrices A1, A2, A3, and A4 of size 20
were generated from the questionnaire, one per question. In each adjacency
matrix A1, A2, A3 the element (i, j) is equal to 1 if the row student i nominated the
column student j; otherwise the element (i, j) = 0. In matrix A4 the element
(i, j) is equal to −1 if the row student i nominated the column student j; otherwise
the element (i, j) = 0. Then each matrix A1, A2, A3 was summed with
the matrix A4, and a binarization procedure was applied to the result: if the
element (i, j) is less than or equal to 0, it is set to 0; otherwise it is set to 1.</p>
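      <p>The preprocessing step above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name combine_and_binarize and the 3-student matrices are hypothetical and are not taken from the study data.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def combine_and_binarize(a_k, a_4):
    """Add the negative-nomination matrix a_4 to a nomination matrix a_k,
    then set every entry that is not strictly positive to 0 and the rest to 1."""
    n = len(a_k)
    return [[1 if a_k[i][j] + a_4[i][j] > 0 else 0 for j in range(n)]
            for i in range(n)]


# Hypothetical 3-student example (not the study data).
a1 = [[0, 1, 1],
      [1, 0, 0],
      [0, 1, 0]]
a4 = [[0, -1, 0],   # student 0 does not want to cooperate with student 1
      [0, 0, 0],
      [-1, 0, 0]]   # student 2 does not want to cooperate with student 0

result = combine_and_binarize(a1, a4)
print(result)
```
]]></preformat>
      <p>In this toy example the nomination from student 0 to student 1 is cancelled by the negative nomination, while a negative nomination without a matching positive one simply stays at 0.</p>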
      <p>The social networks are represented by directed graphs Gk = (Vk, Ek), k =
1, 2, 3, where the set of vertices Vk includes the n = 20 members of the group, and the set
of edges Ek ⊆ Vk × Vk represents the relation «reciprocal nomination» that
corresponds to question k = 1, 2, 3. A key point in considering these relationships
is the reciprocal influence, whereby the team’s members influence each other. The
graph Gk is directed, i.e. every edge (i, j) ∈ Ek links the source vertex i to
the target vertex j. The direction of the edges makes the adjacency matrices
A1, A2, A3 of the directed graphs G1, G2, G3 non-symmetric, because the
source vertex nominates the target vertex but not necessarily vice versa. The
number of edges satisfies mk = |Ek| ≤ n · (n − 1), k = 1, 2, 3.</p>
      <p>Descriptive Network Statistics. To address research question Q1, we
calculated descriptive network statistics to characterize the network structure at the
classroom level. Figure 1 illustrates the original networks G1, G2, G3, reflecting
the structure of reciprocal nominations in the group. These networks G1, G2, G3
have the identical vertex set V , |V | = n = 20, but different sets of edges E1,
E2, E3: |E1| = m1 = 81, |E2| = m2 = 72, |E3| = m3 = 71 respectively. In each
network, a vertex represents an actor, and an edge denotes a nomination
between two actors. Each actor in the student group was assigned a label:
A01, A02, . . . , A20. In our experiment, loops and multi-edges are not allowed.</p>
      <p>The diameter of graph G2 is equal to 6 because the longest shortest path,
through actors A15, A06, A12, A19, A11, A08 and A10, takes 6 edges; the
diameter of graphs G1 and G3 is equal to 5 (Figure 1, green arrows). The average
path lengths between vertex pairs are {2.166, 2.236, 2.190} respectively.
In our experiment, the density ranges only from 18% (0.186) to 21% (0.213), which can be
explained by the limitation of the questionnaire, which recommended nominating
up to four actors. The transitivity ranges from 0.399 to 0.409, indicating a
low level of intra-group interaction; the reciprocity ranges from 0.11 to 0.66.</p>
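      <p>As a rough check of the density figures, the density of a directed graph without loops can be computed as m / (n · (n − 1)). The sketch below applies this to the edge counts reported above and also shows one common definition of reciprocity (the share of edges whose reverse edge is present). The reciprocity definition and the three-edge toy graph are assumptions made for illustration; the text does not state which definition was used.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def density(n, m):
    """Density of a directed graph with n vertices and m edges, no loops."""
    return m / (n * (n - 1))


def reciprocity(edges):
    """Share of directed edges (u, v) for which (v, u) is also an edge."""
    edge_set = set(edges)
    mutual = sum(1 for (u, v) in edge_set if (v, u) in edge_set)
    return mutual / len(edge_set)


# Edge counts m1, m2, m3 reported for G1, G2, G3 with n = 20.
densities = [round(density(20, m), 3) for m in (81, 72, 71)]
print(densities)

# Hypothetical toy graph: A and B nominate each other, A also nominates C.
toy_edges = [("A", "B"), ("B", "A"), ("A", "C")]
print(reciprocity(toy_edges))
```
]]></preformat>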
      <p>A basic property of the vertices in a graph is their degree. Degree provides
information on the position of actors and how they communicate. The in-degree
din(v) (out-degree dout(v)) of vertex v is equal to the number of incoming
(outgoing) edges. The in-, out-, and total degree distributions of the vertices of the graphs
G1, G2, G3 are shown in Figure 2. As we can see from Figure 2, the modal
interval is equal to 4, i.e. it has a higher frequency than any other interval.
The clustering coefficient cl(v) is computed from the adjacency matrix A. In our case, cl(v) ranges from
0.248 to 0.377.</p>
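      <p>The degree definitions can be illustrated directly on an adjacency matrix: row sums give out-degrees and column sums give in-degrees. The following sketch uses a hypothetical 3-vertex matrix; the function name degrees is illustrative only.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def degrees(adj):
    """Return (in_degree, out_degree, total_degree) lists for a 0/1
    adjacency matrix where adj[i][j] == 1 means an edge from i to j."""
    n = len(adj)
    out_deg = [sum(adj[i]) for i in range(n)]
    in_deg = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    total = [o + i for o, i in zip(out_deg, in_deg)]
    return in_deg, out_deg, total


# Hypothetical example: 0 nominates 1 and 2, 1 nominates 0, 2 nominates 1.
adj = [[0, 1, 1],
       [1, 0, 0],
       [0, 1, 0]]
in_deg, out_deg, total_deg = degrees(adj)
print(in_deg, out_deg, total_deg)
```
]]></preformat>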
      <p>To evaluate the significance of the network statistics (Table 1),
random graphs were simulated. Based on the topological properties of each
graph G1, G2, G3, 1000 random networks were generated to compute their
average network statistics. Table 1 gives these statistics in the last column, ⟨G2⟩,
corresponding to the network G2. The network G2 has approximately the same
average path length as the 1000 random graphs of the same size (2.236 and 2.262
respectively), and the network G2 has a clustering coefficient that is higher than
the corresponding value for the 1000 random graphs (0.3767 and 0.258 respectively).</p>
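      <p>The random-graph comparison can be sketched as follows. The text does not specify the random graph model, so this example assumes a G(n, m) model (m distinct directed edges chosen uniformly, no loops) with n = 20 and m = 72 as in G2; the helper names and the seed are illustrative. Average path length is taken over ordered vertex pairs that are connected by a directed path.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import random
from collections import deque


def random_digraph(n, m, rng):
    """Assumed G(n, m) model: m distinct directed edges, no loops."""
    pairs = [(u, v) for u in range(n) for v in range(n) if u != v]
    return rng.sample(pairs, m)


def average_path_length(n, edges):
    """Mean shortest-path length over ordered pairs joined by a path."""
    succ = {u: [] for u in range(n)}
    for u, v in edges:
        succ[u].append(v)
    total, count = 0, 0
    for s in range(n):
        dist = {s: 0}
        queue = deque([s])
        while queue:  # breadth-first search from s
            u = queue.popleft()
            for v in succ[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        count += len(dist) - 1
    return total / count if count else 0.0


rng = random.Random(0)
lengths = [average_path_length(20, random_digraph(20, 72, rng))
           for _ in range(1000)]
mean_length = sum(lengths) / len(lengths)
print(round(mean_length, 3))
```
]]></preformat>
      <p>The simulated mean can then be set against the observed value of the real network, in the same way as the last column of Table 1.</p>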
      <p>
        Community Detection Algorithms. At present, many community
detection algorithms have been designed [
        <xref ref-type="bibr" rid="ref1 ref13 ref15 ref19 ref3">1, 3, 13, 15, 19</xref>
        ]. In our experiments, we
chose three community detection algorithms: the edge betweenness algorithm, the
walktrap algorithm and the optimal community algorithm. We used the R implementations of the
algorithms in the igraph software package [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p>
        Denote by C = {C1, C2, . . . , Cp} a partition of the set of vertices V . We call
C a clustering of the graph G and each Ci, which is required to be nonempty, a
community, i = 1, 2, . . . , p. Following the paper [
        <xref ref-type="bibr" rid="ref1">1</xref>
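        ], <inline-formula><tex-math>E(C) = \bigcup_{i=1}^{p} E(C_i)</tex-math></inline-formula> is the
set of intracluster edges, <inline-formula><tex-math>m(C) = |E(C)|</tex-math></inline-formula>, and <inline-formula><tex-math>E \setminus E(C)</tex-math></inline-formula> is the set of intercluster
edges, <inline-formula><tex-math>\bar{m}(C) = |E \setminus E(C)|</tex-math></inline-formula>.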
      </p>
      <p>
        Let us list major features of these algorithms. The edge betweenness
algorithm [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] is based on calculating the betweenness of every edge in the graph (the number of shortest paths
between any two vertices that pass through the edge)
and removing the edge with the largest betweenness score. This process is
repeated on the resulting graph until no edges remain. A partition C of the set of
vertices V can be computed in O(nm<sup>2</sup>) time. The walktrap algorithm is based
on a random walk process on a graph [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. A walker starts on a vertex and moves
to a random neighbouring vertex at each time step. After a few steps (3–5) the walker is more
likely to remain within the same community because there are only a few external
edges. The walktrap algorithm uses the results of this random walk process to
merge separate vertices into communities in a way that minimizes the distance to the other
vertices in the community. The time complexity of the walktrap algorithm is O(mn<sup>2</sup>)
in the worst case. The optimal community algorithm [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] detects the
community structure of a graph by maximizing the modularity score over all possible
partitions. The algorithm starts with the singleton community clustering and
iteratively merges the two communities that yield a clustering with the best
modularity. The time complexity of the optimal community algorithm is exponential in
the number of vertices, O(2<sup>n</sup>).
      </p>
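      <p>One step of the edge betweenness procedure described above can be sketched in plain Python. This is an illustrative implementation for a small undirected, unweighted graph using Brandes-style accumulation; it is not the igraph routine used in the study, and the six-vertex graph (two triangles joined by a single edge) is a made-up example.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from collections import deque


def edge_betweenness(nodes, edges):
    """Number of shortest paths through each edge of an undirected graph."""
    adj = {v: set() for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    score = {frozenset(e): 0.0 for e in edges}
    for s in nodes:
        # Breadth-first search from s: count shortest paths (sigma)
        # and remember the predecessors on those paths.
        sigma = {v: 0 for v in nodes}
        sigma[s] = 1
        dist = {s: 0}
        preds = {v: [] for v in nodes}
        order = []
        queue = deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
                if dist[w] == dist[u] + 1:
                    sigma[w] += sigma[u]
                    preds[w].append(u)
        # Accumulate path shares from the farthest vertices back to s.
        delta = {v: 0.0 for v in nodes}
        for w in reversed(order):
            for u in preds[w]:
                share = sigma[u] / sigma[w] * (1.0 + delta[w])
                score[frozenset((u, w))] += share
                delta[u] += share
    # Every unordered vertex pair was visited from both endpoints.
    return {e: b / 2.0 for e, b in score.items()}


# Hypothetical graph: triangles {0, 1, 2} and {3, 4, 5} joined by edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
scores = edge_betweenness(range(6), edges)
bridge = max(scores, key=scores.get)
print(sorted(bridge), scores[bridge])
```
]]></preformat>
      <p>Removing the highest-scoring edge (here the bridge between the two triangles) and recomputing the scores on the remaining graph gives the next split of the hierarchy.</p>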
      <p>Using the algorithms mentioned above, each graph G1, G2, and G3 was divided
into communities. Figure 3 shows the results of partitioning the network G2 into clusters,
which are denoted by different colors. Five communities were detected with the edge
betweenness algorithm, while the walktrap and optimal algorithms partitioned
the graph G2 into 4 communities. The numbers of actors in these communities
differ: V (Cb) = {14, 1, 1, 3, 1}, V (Cw) = {3, 8, 5, 4} and V (Cop) = {7, 5, 5, 3}.
We use the subscripts {b, w, op} to denote the clustering algorithm that was used.
Figure 4 shows the distribution of the number of communities detected with the edge
betweenness algorithm, the walktrap algorithm and the optimal algorithm for 1000 random
graphs of the same size as the network G2. According to Figure 4,
the actual number of communities detected in the original network G2 (4
communities for the walktrap and optimal algorithms) would be considered typical from
the perspective of random graphs, while the edge betweenness algorithm
partitioned the 1000 random graphs into between 1 and 14 communities.
In the modularity score, m(C) = |{(u, v) ∈ E : u ∈ C, v ∈ C}| is the number of edges inside C, and E(m(C)) is the expected value of
m(C) under some model of random edge assignment.</p>
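      <p>To make the idea of comparing observed and expected intracluster edges concrete, the sketch below computes the standard Newman modularity of a partition for an undirected graph, where the expected share of edges inside a community is (d / 2m) squared and d is the sum of degrees in that community. This particular normalization is an assumption; the exact formula used in the study is not recoverable from the text, and the example graph is hypothetical.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def modularity(edges, membership):
    """Newman modularity of a partition of an undirected graph.

    edges      -- list of (u, v) pairs, each undirected edge listed once
    membership -- dict mapping every vertex to its community label
    """
    m = len(edges)
    intra = {}        # edges with both endpoints in the community
    degree_sum = {}   # sum of vertex degrees per community
    for u, v in edges:
        cu, cv = membership[u], membership[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            intra[cu] = intra.get(cu, 0) + 1
    return sum(intra.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree_sum.items())


# Hypothetical graph: two triangles joined by the edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_teams = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_team = {v: "a" for v in range(6)}
q_two = modularity(edges, two_teams)
q_one = modularity(edges, one_team)
print(round(q_two, 4), round(q_one, 4))
```
]]></preformat>
      <p>Splitting the toy graph into its two triangles scores higher than keeping all vertices in one community, which is the behaviour the stopping rule of the hierarchical algorithms relies on.</p>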
      <p>
        The edge betweenness and walktrap algorithms are hierarchical clustering
algorithms, and the modularity score of the current clustering is stored after each
time step. In the optimal community algorithm the highest modularity is determined
after (n − 1) merges. The algorithms above use the modularity score to decide
where to stop splitting or merging. Another way to assess the quality of community
detection is to compute a scoring function based on the internal connectivity, the
external connectivity, or a combination of the internal and external connectivity of
a vertex set [
        <xref ref-type="bibr" rid="ref20 ref8">8, 20</xref>
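        ]. For each network partition Cb, Cw, Cop from Figure 3, the
numbers of intracluster edges m(C) are given by the diagonal elements and the
numbers of intercluster edges m̄(C) between communities Ci and Cj are given by the
(i, j) elements, i ≠ j, i, j ∈ {1, 2, . . . , p}, of the following matrices:
        <disp-formula><tex-math><![CDATA[
E(C_b) = \begin{pmatrix}
39 & \mathbf{3} & \mathbf{4} & \mathbf{13} & \mathbf{2} \\
\mathbf{3} & 0 & \mathbf{0} & \mathbf{3} & \mathbf{1} \\
\mathbf{4} & \mathbf{0} & 0 & \mathbf{2} & \mathbf{0} \\
\mathbf{13} & \mathbf{3} & \mathbf{2} & 4 & \mathbf{1} \\
\mathbf{2} & \mathbf{1} & \mathbf{0} & \mathbf{1} & 0
\end{pmatrix}, \quad
E(C_w) = \begin{pmatrix}
4 & \mathbf{5} & \mathbf{1} & \mathbf{0} \\
\mathbf{5} & 21 & \mathbf{11} & \mathbf{5} \\
\mathbf{1} & \mathbf{11} & 15 & \mathbf{4} \\
\mathbf{0} & \mathbf{5} & \mathbf{4} & 6
\end{pmatrix}, \quad
E(C_{op}) = \begin{pmatrix}
19 & \mathbf{9} & \mathbf{5} & \mathbf{5} \\
\mathbf{9} & 15 & \mathbf{5} & \mathbf{2} \\
\mathbf{5} & \mathbf{5} & 7 & \mathbf{1} \\
\mathbf{5} & \mathbf{2} & \mathbf{1} & 4
\end{pmatrix}.
]]></tex-math></disp-formula>
      </p>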
      <p>
        In each matrix, the row and column sums give the number of edges
incident on a given community. In the matrices, the numbers of intercluster
edges are shown in bold. Using the data from the matrices, we calculated the conductance [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ]:
        <disp-formula><tex-math>\mathrm{con}(C) = \frac{\bar{m}(C)}{2\, m(C) + \bar{m}(C)},</tex-math></disp-formula>
where m̄(C) is the number of edges on the boundary of community C, m̄(C) =
|{(u, v) ∈ E : u ∈ C, v ∉ C}|. The conductance takes a value between 0 (best score)
and 1 (worst score). Table 2 gives the values of the scoring functions, modularity
mod (C) and conductance con(C), for the partitions Cb, Cw, Cop and the networks
G1, G2, and G3.
      </p>
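      <p>The conductance formula can be evaluated directly from a community block matrix, taking m(C) as the sum of the diagonal and m̄(C) as the sum of the entries above the diagonal. The sketch below does this for the walktrap matrix E(Cw) of G2 as read from the matrices above; the function name partition_conductance is illustrative, and applying the formula to the partition as a whole is an assumption that happens to reproduce the value 0.22 quoted later for the walktrap partition.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def partition_conductance(block):
    """Conductance m_bar / (2 * m + m_bar) from a symmetric block matrix.

    block[i][i] -- number of edges inside community i
    block[i][j] -- number of edges between communities i and j (i != j)
    """
    p = len(block)
    intra = sum(block[i][i] for i in range(p))
    inter = sum(block[i][j] for i in range(p) for j in range(i + 1, p))
    return inter / (2 * intra + inter)


# Walktrap block matrix for G2, as read from E(Cw) (assumed reading).
walktrap_g2 = [[4, 5, 1, 0],
               [5, 21, 11, 5],
               [1, 11, 15, 4],
               [0, 5, 4, 6]]
value = partition_conductance(walktrap_g2)
print(round(value, 3))
```
]]></preformat>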
      <p>
        Community Structures Comparison. After obtaining the communities, we
compared the partitions using various metrics; the results are presented in
Table 3. The normalized mutual information (NMI) measure is based on the fact
that if two partitions are similar to each other, then only a small amount of
additional information is needed to infer one clustering assignment from the
other [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. The NMI measure and the Rand index (RI) take values between 0 and 1;
when the two partitions agree perfectly, both measures equal 1 [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. The adjusted
Rand index (ARI) can yield negative values, and it is more
sensitive than the Rand index when measuring agreement between two partitions [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
As one can see from Table 3, the partitions Cw and Cop are similar to each other,
while the partition Cb differs from the partitions Cw and Cop considerably.
      </p>
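      <p>The Rand index and the adjusted Rand index can be computed from two label vectors with the standard pair-counting formulas. The sketch below is a plain-Python illustration with hypothetical four-vertex partitions; it is not the routine used to produce Table 3, and it does not handle the degenerate case in which the ARI denominator is zero.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from collections import Counter
from itertools import combinations
from math import comb


def rand_index(labels_a, labels_b):
    """Share of vertex pairs on which the two partitions agree."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum((labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
                for i, j in pairs)
    return agree / len(pairs)


def adjusted_rand_index(labels_a, labels_b):
    """Rand index corrected for chance (can be negative)."""
    n = len(labels_a)
    cells = Counter(zip(labels_a, labels_b))
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / comb(n, 2)
    maximum = (sum_a + sum_b) / 2
    return (sum_cells - expected) / (maximum - expected)


# Hypothetical partitions of four vertices.
first = [0, 0, 1, 1]
same = [1, 1, 0, 0]      # same grouping, different label names
crossed = [0, 1, 0, 1]   # every pair grouped differently

print(rand_index(first, same), adjusted_rand_index(first, same))
print(rand_index(first, crossed), adjusted_rand_index(first, crossed))
```
]]></preformat>
      <p>The second pair of partitions shows why the adjusted index is the stricter measure: the plain Rand index still credits pairs that are separated in both partitions, while the adjusted index drops below zero.</p>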
      <p>Key actor analysis. Following the analysis described above, we identified the key
social actors. We compared the relative values of eigenvector centrality
and betweenness centrality. Betweenness centrality gives a higher score to
a vertex that sits on many shortest paths between other vertex pairs, and this centrality
usually reflects access to novel information and control benefits. Eigenvector
centrality gives a higher score to a vertex if it is connected to many high-scoring
vertices. We fitted a linear regression model; Figure 5 shows a scatter plot
of eigenvector centrality as a function of betweenness centrality. The equation
of the line in Figure 5 is y = 0.0102x + 0.2833 (red line); this linear model was
significant (F = 14.118, p-level = 0.0014 &lt; 0.05).</p>
      <p>
        Figure 5 shows each vertex’s relative value of eigenvector centrality and
betweenness, with points scaled by the value of the regression residuals and actor labels scaled
by the absolute value of the residuals. D. Conway [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] has found that people with
low eigenvector centrality but high betweenness centrality are important
gatekeepers between teams (actors A08, A11), while people (actors A14, A17) with
high eigenvector centrality but low betweenness centrality have direct contact with
important people (actors A12, A13, and A19).
      </p>
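      <p>The residual-based reading of Figure 5 can be sketched with an ordinary least-squares fit of eigenvector centrality on betweenness centrality. The centrality values below are hypothetical and chosen only so that the arithmetic is easy to follow; a clearly negative residual marks a vertex whose eigenvector centrality is low relative to its betweenness (a gatekeeper candidate), and a clearly positive residual marks the opposite case.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical centrality scores for five actors (not the study data).
actors = ["P1", "P2", "P3", "P4", "P5"]
betweenness = [0.0, 10.0, 20.0, 30.0, 40.0]
eigenvector = [0.30, 0.35, 0.60, 0.55, 0.70]

slope, intercept = fit_line(betweenness, eigenvector)
residuals = [y - (slope * x + intercept)
             for x, y in zip(betweenness, eigenvector)]
for name, r in zip(actors, residuals):
    print(name, round(r, 3))
```
]]></preformat>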
      <p>The results of the community structure identification (Figure 3) and key
actor analysis (Figure 5) were discussed with the teachers and students. A
frequently mentioned disadvantage of the community structure based on the
edge betweenness algorithm was that it produced a more unbalanced partition than the
walktrap and optimal algorithms. The means and standard deviations of the community sizes
are: M = 4.0, SD = 5.65 (edge betweenness algorithm),
M = 5.0, SD = 2.16 (walktrap), M = 5.0, SD = 1.63 (optimal algorithm).</p>
      <p>The teachers recommended separating the key actors A12 and A19 into
different teams. According to the community structure identification, the following
options are available: to move a key actor from C<sub>2</sub><sup>w</sup> to C<sub>1</sub><sup>w</sup> or from C<sub>1</sub><sup>op</sup> to C<sub>4</sub><sup>op</sup>
(Figure 3). The SNA modelling we conducted leads to the following result: the
compromise between the modularity and the conductance is to move the actor A19 from
C<sub>2</sub><sup>w</sup> to C<sub>1</sub><sup>w</sup>. In this case, as we expected, the modularity score decreased from
0.314 to 0.304, while the conductance increased from 0.22 to 0.263.</p>
    </sec>
    <sec id="sec-4">
      <title>Effect of Network Structure on Academic Performance</title>
      <p>As a result, four project teams were formed after the midterm (9th week of the
fall semester of 2015): T1 = {A08, A10, A11, A19}, T2 = {A04, A05, A07, A09, A12,
A17, A18}, T3 = {A06, A13, A14, A15, A16}, and T4 = {A01, A02, A03, A20}. To
address research question Q2, we examined the influence of teams’ interactions
over time on academic performance.</p>
      <p>In our experiment, during the second period of the term (from the 10th to the 18th
week), students from the experimental group (gexp, n1 = 20) additionally
met with the peers of their team once a week during a two-hour tutorial in
the classroom, where they worked on a project. Students could not
change teams during the experiment. The control group (gcont, n2 = 21) was not
split into project teams and did not have any additional meetings.</p>
      <p>Academic performance of the experimental group and the
control group was measured at two time points: a) at the midterm (p1 = 9 weeks),
b) at the end of term (p2 = 9 weeks). We received this information directly from
the Education Office. In both groups, each student could earn 480 points
during the fall semester of 2015 (50% at the midterm, 50% at the end of the
term). In the experimental group, the average overall performance at the midterm
was M = 122.7 points, SD = 26.96, range R = [45, 163]. In the control group, the
average overall performance at the midterm was M = 140.09 points, SD = 22.14,
range R = [97, 182].</p>
      <p>Figure 6 gives descriptive statistics of academic performance (in points) for
the experimental and control groups as well as for each team. There are outlier
points in the dataset; these observations correspond to the points of actor
A04, who dropped out of the experiment after the 11th week.</p>
      <p>First, the Shapiro-Wilk test was applied to check whether the
dependent variable came from a normally distributed population. The dependent
variable is the number of points at the two time points (at the midterm and at the end
of term). At the .05 significance level the null hypothesis was rejected for the experimental group: there
is evidence that the distribution of points in the experimental group (W =
0.87, p-value = 0.017 &lt; 0.05) is not from a normally distributed population,
while the distribution of points in the control group (W = 0.97, p-value =
0.73 &gt; 0.05) is consistent with a normally distributed population. Hence, we decided to
use non-parametric statistics. It is clear from Figure 6 that the median of
the experimental group (gexp) at the first time point (p1) is less than the median
of the control group (gcont), and vice versa at the second time point (p2). At
the first time point (p1) the median was 128.5 points in the experimental group
and 144 points in the control group; at the second time point
(p2) the experimental median was 205 and the control median was 189.</p>
      <p>We applied the Mann-Whitney-Wilcoxon test to the null
hypothesis that students from the control group tend to have larger values of academic
performance (in points) than students from the experimental group. At the .05
significance level, the null hypothesis (U = 137.5, p-value = 0.058 &gt; 0.05) was
not rejected at the first time point. We then conducted a randomization test of no
difference in population medians (null hypothesis) against a two-tailed alternative,
where the difference in sample medians is the test statistic. We created 5000
randomizations of the n1 + n2 = 20 + 21 = 41 observations. The two-tailed
probability under the null hypothesis is p-value = 0.0376 &lt; 0.05, and the 95% confidence
limits are −14.01 and 14.0. The obtained median difference was −15.5 points,
which falls outside the interval. Thus we can reject the null
hypothesis and conclude that the median of the control group is significantly
greater than the median of the experimental group. At the second
time point (p2) we repeated the randomization test for comparing two medians;
the 95% confidence limits are −16.5 and 16.5. The obtained median difference was
16.0 points, which falls inside the interval. We cannot reject the null
hypothesis, and we can expect that the influence of teams’ interactions on
individual academic performance is positive.</p>
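      <p>The randomization test for a difference in medians can be sketched as below. This is a generic permutation scheme under stated assumptions (group labels are reshuffled, the two-tailed p-value is the share of reshuffles at least as extreme as the observed difference, and the limits are the 2.5th and 97.5th percentiles of the reshuffled differences). The function name, the seed and the score lists are hypothetical and are not the study data.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import random
from statistics import median


def randomization_test(group_a, group_b, n_iter=5000, seed=0):
    """Two-tailed randomization test for median(group_a) - median(group_b).

    Returns the observed difference, the p-value, and the central 95%
    limits of the difference under random reassignment of group labels.
    """
    rng = random.Random(seed)
    observed = median(group_a) - median(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    diffs = []
    for _ in range(n_iter):
        rng.shuffle(pooled)
        diffs.append(median(pooled[:n_a]) - median(pooled[n_a:]))
    extreme = sum(1 for d in diffs if abs(d) >= abs(observed))
    diffs.sort()
    lower = diffs[int(0.025 * n_iter)]
    upper = diffs[int(0.975 * n_iter) - 1]
    return observed, extreme / n_iter, (lower, upper)


# Hypothetical point totals for two small groups (not the study data).
experimental = [110, 118, 121, 125, 128, 131, 135, 140]
control = [122, 130, 138, 141, 144, 147, 150, 158]
observed, p_value, limits = randomization_test(experimental, control)
print(observed, p_value, limits)
```
]]></preformat>
      <p>An observed difference outside the printed limits corresponds to rejecting the null hypothesis of equal medians at the 5% level, which mirrors the interval argument used in the text.</p>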
      <p>The next set of statistical tests was applied to the experimental group. First,
we need to test that none of the k = 4 teams stochastically dominates another.
The Kruskal-Wallis test was applied to decide whether the population medians
of the dependent variable are the same across all levels of a factor. The factor has
four levels: 1 = T1, 2 = T2, 3 = T3, and 4 = T4. The null hypothesis is that
the medians are equal across the teams. At the .05 significance level, we conclude
that the medians are equal across the teams (χ<sup>2</sup> = 2.0, p-value = 0.57 &gt; 0.05)
at the midterm. Second, for the comparison across repeated measures at the
midterm and the end of the term, Friedman’s test was used. It tests
for differences between the two data snapshots when the dependent variable
being measured is ordinal (ranks in our case). The null hypothesis that the
distributions are the same across repeated measures was rejected (χ<sup>2</sup> = 16.2,
p-value = 5.7 · 10<sup>−5</sup> &lt; 0.05). Hence, the distributions across repeated measures
are different. There is evidence of the influence of teams’ interactions on
individual academic performance.</p>
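      <p>For reference, the Kruskal-Wallis statistic can be computed by ranking all observations together and comparing the rank sums of the groups. The sketch below uses average ranks for ties but omits the tie correction factor, and the three small groups are hypothetical; under the null hypothesis the statistic is compared with a chi-squared distribution with k − 1 degrees of freedom.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (no tie correction) for a list of groups."""
    pooled = sorted(v for g in groups for v in g)
    # Assign each distinct value the average of the ranks it occupies.
    rank = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        rank[pooled[i]] = (i + 1 + j) / 2
        i = j
    n = len(pooled)
    weighted = sum(sum(rank[v] for v in g) ** 2 / len(g) for g in groups)
    return 12 / (n * (n + 1)) * weighted - 3 * (n + 1)


# Hypothetical scores for three teams (not the study data).
teams = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
h = kruskal_wallis_h(teams)
print(round(h, 3))
```
]]></preformat>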
    </sec>
    <sec id="sec-5">
      <title>Summary</title>
      <p>Project teams were detected using various social network analysis algorithms.
The key actor analysis allows us to identify the individuals who have the strongest
influence on other members of the group. The results of community detection
can be used in the educational process but require discussion with teachers
and students. As a compromise between the SNA results and the semantic
recommendations of teachers and students, we chose the base algorithm,
and project teams were created. We found evidence of peer effects on academic
performance. In the experimental group as a whole, as well as in the detected
teams, academic performance increased in comparison with the control group.</p>
      <p>Our longitudinal study can be continued in the
following directions. The first is community detection in terms of motifs, i.e. dyads and
triads (two or three students who are connected only to each other), as subgraphs
with a fixed number of vertices and a given topology. Such a description
allows us to identify the complexity level of a project for each team and different
methods for assessing team performance. The second is the application of
qualitative analysis of relations inside and outside project teams and the assessment of
potential predictive factors of relations.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Brandes</surname>
            ,
            <given-names>U.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Delling</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gaertler</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gorke</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hoefer</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nikoloski</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wagner</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>On modularity clustering</article-title>
          .
          <source>IEEE Transactions on Knowledge and Data Engineering</source>
          <volume>20</volume>
          (
          <issue>2</issue>
          ),
          <fpage>172</fpage>
          -
          <lpage>187</lpage>
          (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Carrington</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Scott</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wasserman</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Models and Methods in Social Network Analysis</article-title>
          . Cambridge University Press, New York (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Clauset</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Newman</surname>
            ,
            <given-names>M.E.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Moore</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Finding community structure in very large networks</article-title>
          (
          <year>2004</year>
          ), http://www.arxiv.org/abs/cond-mat/0408187
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Conway</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Social network analysis in R</article-title>
          (
          <year>2009</year>
          ), http://files.meetup.com/1406240/sna_in_R.pdf
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Csardi</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nepusz</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>The igraph software package for complex network research</article-title>
          .
          <source>InterJournal Complex Systems</source>
          ,
          <volume>1695</volume>
          (
          <year>2006</year>
          ), http://igraph.org
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Danon</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Diaz-Guilera</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Duch</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Arenas</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Comparing community structure identification</article-title>
          (
          <year>2005</year>
          ), http://arxiv.org/abs/cond-mat/0505245v2
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Ertem</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Veremyev</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Butenko</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Detecting large cohesive subgroups with high clustering coefficients in social networks</article-title>
          .
          <source>Social Networks</source>
          <volume>46</volume>
          ,
          <fpage>1</fpage>
          -
          <lpage>10</lpage>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Fortunato</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Community detection in graphs</article-title>
          .
          <source>Physics Reports</source>
          <volume>486</volume>
          ,
          <fpage>75</fpage>
          -
          <lpage>174</lpage>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Hubert</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Arabie</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Comparing partitions</article-title>
          .
          <source>Journal of Classification</source>
          <volume>2</volume>
          (
          <issue>1</issue>
          ),
          <fpage>193</fpage>
          -
          <lpage>218</lpage>
          (
          <year>1985</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Kolaczyk</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Csardi</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>Statistical Analysis of Network Data with R</article-title>
          . Springer, New York (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Qin</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yun</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>A community detecting algorithm in directed weighted networks</article-title>
          .
          <source>Lecture Notes in Electrical Engineering</source>
          <volume>98</volume>
          ,
          <fpage>11</fpage>
          -
          <lpage>17</lpage>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Lomi</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Snijders</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Steglich</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Torló</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>Why are some more peer than others? Evidence from a longitudinal study of social networks and individual academic performance</article-title>
          .
          <source>Social Science Research</source>
          <volume>40</volume>
          (
          <issue>6</issue>
          ),
          <fpage>1506</fpage>
          -
          <lpage>1520</lpage>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Newman</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Girvan</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Finding and evaluating community structure in networks</article-title>
          .
          <source>Phys. Rev. E</source>
          <volume>69</volume>
          (
          <issue>2</issue>
          ),
          <elocation-id>026113</elocation-id>
          (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Pijl</surname>
            ,
            <given-names>S.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Koster</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hannink</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stratingh</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Friends in the classroom: a comparison between two methods for the assessment of students' friendship networks</article-title>
          .
          <source>Social Psychology of Education</source>
          <volume>14</volume>
          ,
          <fpage>475</fpage>
          -
          <lpage>488</lpage>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Pons</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Latapy</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Computing communities in large networks using random walks</article-title>
          (
          <year>2005</year>
          ), http://arxiv.org/abs/physics/0512106
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Pronin</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Veretennik</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Semyonov</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Formirovanie uchebnyh grupp v universitete s pomoshju analiza socialnyh setej</article-title>
          .
          <source>Voprosy obrazovanija</source>
          (
          <issue>3</issue>
          )
          ,
          <fpage>54</fpage>
          -
          <lpage>74</lpage>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Rand</surname>
            ,
            <given-names>W.M.</given-names>
          </string-name>
          :
          <article-title>Objective criteria for the evaluation of clustering methods</article-title>
          .
          <source>Journal of the American Statistical Association</source>
          <volume>66</volume>
          ,
          <fpage>846</fpage>
          -
          <lpage>850</lpage>
          (
          <year>1971</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Rienties</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Heliot</surname>
            ,
            <given-names>Y.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jindal-Snape</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Understanding social learning relations of international students in a large classroom using social network analysis</article-title>
          .
          <source>Higher Education</source>
          <volume>66</volume>
          ,
          <fpage>489</fpage>
          -
          <lpage>504</lpage>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Rosvall</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bergstrom</surname>
            ,
            <given-names>C.T.</given-names>
          </string-name>
          :
          <article-title>Maps of information flow reveal community structure in complex networks</article-title>
          .
          <source>PNAS</source>
          <volume>105</volume>
          (
          <issue>4</issue>
          ),
          <fpage>1118</fpage>
          -
          <lpage>1123</lpage>
          (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Yang</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Leskovec</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Defining and evaluating network communities based on ground-truth</article-title>
          .
          In:
          <source>Proceedings of the ACM SIGKDD Workshop on Mining Data Semantics - MDS'12</source>
          . Beijing, China (12-16 Aug
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>Zaccaro</surname>
            ,
            <given-names>S.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rittman</surname>
            ,
            <given-names>A.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Marks</surname>
            ,
            <given-names>M.A.</given-names>
          </string-name>
          :
          <article-title>Team leadership</article-title>
          .
          <source>The Leadership Quarterly</source>
          <volume>12</volume>
          ,
          <fpage>451</fpage>
          -
          <lpage>483</lpage>
          (
          <year>2001</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>