=Paper=
{{Paper
|id=Vol-2245/hufamo_paper_1
|storemode=property
|title=Modeling and Analyzing Information Flow in Development Teams as a Pipe System
|pdfUrl=https://ceur-ws.org/Vol-2245/hufamo_paper_1.pdf
|volume=Vol-2245
|authors=Jil Klünder,Oliver Karras,Nils Prenner,Kurt Schneider
|dblpUrl=https://dblp.org/rec/conf/models/KlunderKPS18
}}
==Modeling and Analyzing Information Flow in Development Teams as a Pipe System==
<pdf width="1500px">https://ceur-ws.org/Vol-2245/hufamo_paper_1.pdf</pdf>
<pre>
            Modeling and Analyzing Information Flow
            in Development Teams as a Pipe System
                                 Jil Klünder∗ , Oliver Karras∗ , Nils Prenner∗ , Kurt Schneider∗
                                                     ∗ Leibniz University Hannover,

                                                    Software Engineering Group,
                                                        Hannover, Germany,
                       Email: {jil.kluender, oliver.karras, nils.prenner, kurt.schneider}@inf.uni-hannover.de


    Abstract—Teamwork is essential for developing valuable soft-        directly between two persons, but it may also flow via different
ware. Working in a team requires an appropriate information             persons before reaching the required team member.
exchange among team members in order to avoid loss of                      In this paper, we present the metaphor of a pipe system
information. In order to analyze and improve information flows,
it is recommended to observe the information exchange in a team.        to consider the wide range of ways to transport information –
We propose an approach for modeling the information flow of             and to facilitate the analysis of the information flow. The pipe’s
teams as a pipe system. Different pipe diameters represent the          diameter resembles the amount of information transmitted per
amount of information passing through the pipe. In order to             time, e.g., via chat, phone, or face-to-face. Water in a pipe
show the applicability of our approach, we conducted a case             system can take different ways from a source to a target, but
study in a globally distributed software engineering company.
The study consists of the elicitation of information flows inside       it is limited by the smallest pipe on the way. The same holds
the company and the automated analysis by our approach. We              for information: The maximum amount of information flowing
are able to visualize the information flows, find critical paths such   between two persons is limited by the minimum amount of
as bottlenecks, and improve information flow structures. This           information that can be transported on each part on the way
enables project leaders to customize the communication structure        from the sender to the receiver of information.
to the needs of their team and to prevent loss of information.
    Index Terms—Communication behavior, information sharing,               We aim at visualizing a developer network with different
development teams, developer network, social network analysis           kinds of pipe diameters representing the amount of information
                                                                        flow. The pipe system visualizes information flow within the
                                                                        network and helps to detect critical paths and bottlenecks. This
                       I. INTRODUCTION
                                                                        facilitates the developers’ understanding of the importance of
Communication and information exchange are two essential                adequate information sharing. Furthermore, central nodes can
parts of daily work in software development teams [1]. An               be easily recognized in the network. These central nodes are
inadequate amount of communication is one of the main                   of particular importance for information sharing since they
obstacles hampering successful collaboration [2], [3]. There            bundle a lot of information [13].
exist different types of information such as customer needs                Despite the theoretical orientation of our paper, we apply
or details of story cards that need to be shared within the             our approach in an industrial case study with a globally
team [4]. This intense communication takes place via different          distributed software developing company.
communication channels, such as face-to-face, e-mails, chat                The remainder of this paper is structured as follows:
messages or documents [4], [5]. Developers and stakeholders             In Sec. II, we present related work. In Sec. III, we depict
are often not aware of the ongoing communication behavior               the groundwork of graph theory used in our approach and
and information flow [6], [7]. They frequently face the problem         the information flow analysis with FLOW. In Sec. IV, we
of instantaneously accessing the right information due to               introduce our approach for an automated analysis of FLOW
the distribution to different information stores. Information           diagrams based on pipe systems. Sec. V presents our case
flow is impeded if the developers do not know where to                  study. The results and limitations of our approach are discussed
find the required information. In turn, inadequate information          in Sec. VI. We conclude and present future work in Sec. VII.
sharing complicates teamwork. Analyzing information flow
helps a team to become aware of the communication and                                        II. RELATED WORK
information sharing [8]–[10]. Detecting critical issues helps           Our work focuses on quantifying information flow analysis.
to improve the teamwork and to prevent problems like lost               There are already existing approaches in this area. Durugbo et
information, missing functionality, and a dissatisfied customer.        al. [14] give an overview of existing approaches for modeling
There are different approaches to analyze the information flow          information flow. The authors review literature and distinguish
in projects [11], [12]. These approaches consider the existence         between approaches for diagrammatically (i.e. visually) and
or absence of an information flow. However, they take neither           mathematically (i.e. analytically) modeling information flow.
the amount nor the required time to transport information               According to the literature review of Durugbo et al. [14],
from one person to another into account. Information can flow           most visualizations are done using graphs or networks. Besides
graph theory and social network analysis, probability theory,       (u, v) ∉ E, we assume that there is no information flow, i.e.
vector analysis, Markov models and interaction matrices are         c(u, v) = 0. A flow in G is defined to be a real-valued function
also applied to analyze the information flows [14]. The choice      f : V × V → R satisfying
of visualization and analysis approaches often depends on the         1) Capacity constraint: ∀u, v ∈ V let f (u, v) ≤ c(u, v), i.e.
level of application: It is possible to analyze information flows        the flow is always less than or equal to the capacity
on a macro, meso, and micro level [15]. Our concepts go along         2) Skew symmetry: ∀u, v ∈ V let f (u, v) = −f (v, u), i.e.
with Durugbo et al.’s [14] findings which are not tailored for           going in the opposite direction leads to a negative flow
software development teams.                                           3) Flow conservation: ∀u ∈ V \{s, t}, let ∑_{v∈V} f (u, v) =
   Smith [16] provides an overview of the foundations of                 0, i.e. there is theoretically no loss of information in a
quantitative information flow. He mainly considers the flow of           node; the amount of incoming information is equal to the
secure information. His approach is based on the concept of              amount of outgoing information
vulnerability, which is closely related to Bayes risk measuring     Given a flow and a capacity function, we can define the
uncertainty [16]. The author focuses on sensitive information       residual capacity, which is given by the difference between
as a security issue. His approach is not applicable to our          the actual flow between two nodes and the capacity of the edge
information flow analysis which is based on social interactions.    connecting them. Formally, the residual capacity is defined to
   Lowe [17] presents an approach of quantifying information        be
flow by considering the capacity of a covert channel in a
security system. His approach is based on the Communicating                     ⎧ c(u, v) − f (u, v)   if (u, v) ∈ E
Sequential Processes algebra describing interactions between        cf (u, v) = ⎨ f (v, u)             if (v, u) ∈ E
various kinds of communicating processes. Lowe [17]                             ⎩ 0                    otherwise
considers an information flow between two users, High and Low,
using a covert channel. He wants to quantify the amount of          Then, the residual network Gf = (V, Ef ) induced by a flow f
information flowing from High to Low. Lowe’s [17] approach          is given by the edges in Ef = {(u, v) ∈ V ×V : cf (u, v) > 0}.
is more technical due to the use in security and not based on       Thus, the residual network consists of all nodes of the initial
human factors.                                                      network and the edges are given by those with a not yet
   Kiesling et al. [13] combine the FLOW method for in-             maximal flow. An augmenting path p is a simple path from s
formation flow analysis with social network analysis. They          to t in the residual network Gf . One may show that each edge
apply various centrality measures which are well established        on such a path in the residual network admits some additional
in sociology and psychology. These centrality measures de-          positive flow from u to v without violating the capacity con-
tect central persons who are very important for information         straint on the edge [18]. The residual capacity of an augment-
sharing. The authors consider degree centrality counting the        ing path p is defined to be the maximum amount by which the
number of incoming and outgoing edges, closeness centrality,        flow on each edge in p can be increased without violating the
betweenness and flow betweenness centrality. The different          capacity criteria, i.e. cf (p) = min{cf (u, v) : (u, v) is on p}.
measures indicate which persons are central from different             We aim at calculating the maximum information flow f (s, t)
points of view or which are well situated in the network            between a source s ∈ V and a target t ∈ V . Therefore, we
to share and receive information very fast and easily. This         apply the Ford-Fulkerson method. Algorithm 1 outlines the
approach also helps to quantify information flow analysis           procedure.
since certain centrality measures underline the qualitative
                                                                    Algorithm 1 FORD-FULKERSON-Method [18, p. 724]
results. Compared to our approach, Kiesling et al. [13] only
consider whether there is an edge between two nodes, i.e. an         1: for each edge (u, v) ∈ G.E do
information exchange takes place. In contrast, we consider the       2:   f (u, v) = 0
amount of information which flows over an edge.                      3: end for
                                                                     4: while there exists a path p from s to t in the residual
                      III. BACKGROUND                                   network Gf do
Our approach of modeling information flow is based on graph          5:   cf (p) = min{cf (u, v) : (u, v) ∈ p}
theory and extends the FLOW method. In the following, we             6:   for each edge (u, v) ∈ p do
present the necessary basics to understand our approach.             7:      if (u, v) ∈ E then
                                                                     8:         f (u, v) = f (u, v) + cf (p)
A. Graph Theory                                                      9:      else
We use the Ford-Fulkerson-Method to realize our approach            10:         f (v, u) = f (v, u) − cf (p)
of considering a developer network as pipe system [18]. We          11:      end if
consider a finite, directed graph G = (V, E) with a set of          12:   end for
nodes V and a set of edges E ⊆ V × V . This graph grasped           13: end while
as a network is called flow network [18]. Let c : V ×V → R≥0
be a non-negative function assigning a capacity c(u, v) to each     In the beginning, the flow of each edge within the network
edge (u, v) ∈ E. This function is called capacity function. For     is initialized with zero. Then, we search for a path p from
the source s to the target t in the residual network Gf             communication) is usually bidirectional. If an information flow
induced by the current flow value. We calculate the minimum         is bidirectional, i.e. information flows from A to B as well as
of all residual capacities on p, cf (p), and increase the flow of   from B to A, this is visualized by an arrow pointing in both
each edge of p that lies in E by cf (p). If an edge of p is not in  directions.
the initial network, we decrease the flow on the reverse edge (v, u)
by the same value. This procedure is well-defined and leads to
the requested results [18, p. 714 ff.].
   The Edmonds-Karp algorithm implements the Ford-Fulkerson
method with polynomial runtime. This algorithm
searches for the augmenting path (line 4 in Alg. 1) by using
breadth-first search to find the shortest augmenting path from
s to t. This is helpful since the distance of the shortest path
from s to any other node v ≠ t in the residual network
increases monotonically with each flow augmentation [18, p.
727]. Using this implementation decreases the runtime, since
the total number of flow augmentations is in O(V E) [18, p.
729]. The total runtime is O(V E²).
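
The procedure described above can be illustrated with a short Python sketch. It is not the authors' prototype; the function name edmonds_karp, the dictionary-based graph representation, and the three-node example network are assumptions made only for illustration. The sketch keeps residual capacities per edge, finds the shortest augmenting path with breadth-first search, and augments by the bottleneck cf (p).

```python
from collections import deque


def edmonds_karp(capacity, source, target):
    """Return the maximum flow from source to target.

    capacity maps a directed edge (u, v) to a non-negative capacity,
    i.e. the pipe diameter in the metaphor of the paper.
    """
    if source == target:
        raise ValueError("source and target must differ")

    # Residual capacities; reverse edges start at 0 unless they exist in E.
    residual = {}
    neighbours = {}
    for (u, v), c in capacity.items():
        residual[(u, v)] = residual.get((u, v), 0.0) + c
        residual.setdefault((v, u), 0.0)
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)

    max_flow = 0.0
    while True:
        # Breadth-first search for the shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and target not in parent:
            u = queue.popleft()
            for v in neighbours.get(u, ()):
                if v not in parent and residual[(u, v)] > 1e-12:
                    parent[v] = u
                    queue.append(v)
        if target not in parent:
            return max_flow  # no augmenting path left

        # Collect the path and its bottleneck (smallest residual capacity).
        path = []
        v = target
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[edge] for edge in path)

        # Augment: use up forward capacity, open up reverse capacity.
        for u, v in path:
            residual[(u, v)] -= bottleneck
            residual[(v, u)] += bottleneck
        max_flow += bottleneck


# Hypothetical three-person network: A talks to B, B writes for C, A mails C.
pipes = {("A", "B"): 0.7, ("B", "C"): 0.2, ("A", "C"): 0.4}
print(round(edmonds_karp(pipes, "A", "C"), 3))
```

In this hypothetical network the route via B is limited by its narrowest pipe (0.2), which mirrors the bottleneck idea of the pipe metaphor.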
B. Information flow analysis with FLOW                                          Figure 1: Exemplary FLOW diagram [20]
FLOW is an established method to analyze and improve
the communication in software projects [12]. It is a system-        Figure 1 visualizes the activity Sprint Planning with incoming
atic analysis of information flows in software development          information from the Scrum Master, the Product Owner, the
projects, but it is also applicable for other kinds of projects.    Scrum Team and the Backlog, which is usually documented on
FLOW helps to detect gaps or anomalies in order to avoid a          a task board or in a ticket system and hence solid. The results of
loss of information by providing a structured procedure for         the task are Story Cards for the next Sprint. The Template for
evaluating, visualizing, analyzing and improving information        Story Cards controls the task and Trello1 supports the activity.
flow in teams [12], [19].                                              FLOW diagrams mostly comprise various patterns. During
   FLOW distinguishes between two types of information              the analysis, these patterns need to be found in order to
flows and stores: solid and fluid [19]. Solid information is        detect weaknesses. For example, a long chain of fluid informa-
long-term and repeatably accessible and can be understood           tion stores (also referred to as Chinese-whisper-pattern) [21]
by third parties with domain knowledge. An information flow         should be avoided since the information may change during
is defined to be fluid, whenever one of these three criteria is     the transfer due to misunderstandings. Another example for a
not met [12]. Solid information is usually captured in writ-        pattern is the Competence Spider [22]. This is a person that
ten documents, source code or other long-living stores [12].        receives and shares a lot of information. An absence of such
Examples for fluid information are undocumented meetings,           a person endangers the information flow.
informal face-to-face communication and implicit knowledge             Afterwards, the analyst presents his findings and discusses
[10]. Emails may be either fluid or solid, depending on the         possible improvements with the project team. He further sug-
further use and storing of the email [10].                          gests how the team can use its resources more appropriately.
   Analysts conduct interviews with important members of                         IV. QUANTIFYING INFORMATION FLOW
the project to elicit the communication behavior. The group
                                                                    In order to support the FLOW analyst during the analysis
of interviewed persons often consists of the team leader,
                                                                    as well as to allow project leaders and team members to
one or two developers and other important stakeholders or
                                                                    analyze the FLOW diagram, we want to support the analysis
contributors to the team’s work. Each interview starts with a
                                                                    with quantitative measures and visualizations. In order to
short introduction of the interviewee to give an overview of the
                                                                    implement our metaphor of a pipe system, we have to make
main tasks. For each task, the interviewee names the involved
                                                                    some assumptions which we present in this section.
persons and the required information (input), the working
products (output), the supporting artifacts such as templates,      A. Weighting Information Flow
standards or tools [13].
                                                                    In a first step, we define the diameter of the pipes which are
   The results of the FLOW interviews are visualized in a
                                                                    given by weights for the edges. The weights represent the
FLOW diagram. Figure 1 presents the main information stores
                                                                    amount of information flowing between two nodes [23]. A
(faces for fluid stores and documents for solid stores). The
                                                                    small weight defines a small flow capacity whereas a large
rectangle represents an activity summarizing parts of the dia-
                                                                    weight defines a large flow capacity.
gram that are too fine-grained for the visualization or that are
                                                                       Considering a FLOW diagram, we have to distinguish
not further defined during the interviews. Note that the FLOW
                                                                    between four cases without activities and four cases with
diagram is a directed network since information mainly flows
from one person to another, even if the interaction (mostly           1 https://trello.com/
activities: Each combination of fluid and solid stores (solid      communicate” [24]. However, it is also possible to reduce the
→ solid, solid → fluid, fluid → solid, fluid → fluid) and          activities as described in the following.
combinations with an activity (fluid → activity, solid →
activity, activity → fluid, activity → solid). We define the       B. FLOW Diagrams as Pipe Systems
amount of information flow between two information stores          We aim at applying the Edmonds-Karp algorithm of the Ford-
according to their states as represented in Table I. We define     Fulkerson method for calculating the maximum flow within
the edge weights representing the amount of information            a network to the FLOW diagram. In order to apply this
flow between two nodes to range between 0 and 1, where             algorithm, we first have to adjust the FLOW diagram, since
0 represents no information exchange.                              the algorithm only considers nodes and edges, but no activities
                                                                   as in FLOW. As presented by Kiesling et al. [13], there are
Table I: Edge weights representing the amount of information       mainly three possibilities of transforming a FLOW diagram
flow between two nodes in developer networks                       into a network:
                   Source   Receiver   Weight                        1) Connect all incoming and outgoing stores of the activity.
                    Fluid    Fluid      0.7                          2) Represent the activity with a separate node and connect
                    Fluid    Solid      0.5                             all incoming and outgoing nodes with this one by pre-
                    Solid    Fluid      0.4                             serving the direction of flow.
                    Solid    Solid      0.2
                                                                     3) Decide for each incoming node whether it should be
                                                                        directly connected to an outgoing one or to a separately
The weights in Table I require some assumptions:
                                                                        defined one. This case requires deeper insights into the
   • fluid → fluid: Transporting information directly and per-
                                                                        activity.
      sonally enables a good flow of information because the
      receiver is able to directly ask questions in order to       There are some prerequisites deciding which case is most
      understand what the source tells him.                        suitable.
   • fluid → solid: Information sharing from a fluid store to         • Case 1 implies that there is an information flow from

      a solid one like writing something down, is worse than            each incoming to each outgoing node involved in the
      information sharing between persons, since the source             activity. For example, in case of a meeting, this might
      cannot write everything down or may forget implicit               be correct. However, activities often represent a further
      knowledge. Hence, the receiver has less information than          communication network where the information does not
      the source.                                                       necessarily flow from each incoming to each outgoing
   • solid → fluid: Reading is one example for transform-               node.
      ing information from a solid store to a fluid one. The          • Case 2 is the most neutral way of integrating an activity

      amount of information received by reading the document            within a network. After having analyzed the network, it
      strongly depends on the receiver’s knowledge. Missing or          might be possible to re-transfer the network into a FLOW
      not understandable information may not be retrieved by            diagram and adapt the results for analyzing the diagram.
      reading the document twice. The only way of receiving             In case of doubts, we recommend using this possibility.
      missing specific information is talking to the source of        • Case 3 is the most exact way of dissolving an activity.

      the document.                                                     This case requires deep insights into the activity that
   • solid → solid: Information sharing from a solid store              cannot always be achieved.
      to another solid store is given by copying documents         Depending on the choice of the case, there are different
      or changing the file format. However, since there is no      interpretations of the resulting network. In order to connect
      human involved in the process, we assume a very low          both approaches, i.e. the FLOW analysis and the algorithmic
      “collaboration” between these documents.                     extension, it is required not to change the core statements of
These assumptions imply the decreasing weights in Table I. In      the network. Hence, the transformation method needs to be
the case of information flows with involved activities, we have    carefully selected for each activity.
to make different assumptions. We know either the source or           After this step, we receive a FLOW network only consisting
the receiver, but one of them is an activity, which is, first of   of information stores and information flows, i.e. nodes and
all, a black box. Hence, we do not know the granularity of         edges. We are now able to apply the algorithm to the network.
the information flow, i.e. if it is solid or fluid. Sometimes,
knowledge about the activity helps to decide whether an            C. Quantitative Measures
information flow is solid or fluid, but in many cases, this        Based on the pipe system and the use of the Edmonds-Karp
decision is not possible.                                          algorithm, we calculate the maximum flow between two
   We consider information flows within activities as the worst    nodes. Comparing the results between different nodes and
case and choose the minimum edge weight which is given             considering the paths of the maximum flow helps to detect
by the information flow between two solid stores, i.e. 0.2, in     central persons, i.e. persons who are very important for the
case of doubts. We assume that information flows during each       information flow. A person can be central from different
activity, since according to Watzlawick et al. “one cannot not     viewpoints. Social network analysis is a wide field providing
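
As an illustration of Table I and of transformation Case 2 (the activity becomes a separate node), the following Python sketch derives pipe capacities from the kinds of the connected nodes. The names TABLE_I, ACTIVITY_WEIGHT, edge_weight and to_capacities are hypothetical, and the example network is only loosely modeled on Figure 1; flows touching an activity receive the worst-case weight 0.2, as assumed in the text.

```python
# Edge weights from Table I, keyed by (source kind, receiver kind).
TABLE_I = {
    ("fluid", "fluid"): 0.7,
    ("fluid", "solid"): 0.5,
    ("solid", "fluid"): 0.4,
    ("solid", "solid"): 0.2,
}

# Worst-case assumption for flows into or out of an activity (0.2).
ACTIVITY_WEIGHT = min(TABLE_I.values())


def edge_weight(source_kind, receiver_kind):
    """Return the pipe diameter for a flow between two node kinds."""
    if "activity" in (source_kind, receiver_kind):
        return ACTIVITY_WEIGHT
    return TABLE_I[(source_kind, receiver_kind)]


def to_capacities(kinds, flows):
    """Map each directed flow (u, v) to its capacity.

    kinds maps a node name to 'fluid', 'solid' or 'activity';
    the activity is kept as a separate node (Case 2).
    """
    return {(u, v): edge_weight(kinds[u], kinds[v]) for u, v in flows}


# Example loosely following Figure 1 (Sprint Planning as an activity node).
kinds = {
    "Scrum Master": "fluid",
    "Product Owner": "fluid",
    "Backlog": "solid",
    "Sprint Planning": "activity",
    "Story Cards": "solid",
}
flows = [
    ("Scrum Master", "Sprint Planning"),
    ("Product Owner", "Sprint Planning"),
    ("Backlog", "Sprint Planning"),
    ("Sprint Planning", "Story Cards"),
    ("Product Owner", "Scrum Master"),
]
capacities = to_capacities(kinds, flows)
```

The resulting dictionary has the shape that a maximum-flow routine over (u, v) edge capacities would expect; if more is known about an activity, the worst-case weight can be replaced per edge.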
               Table II: Overview of centrality measures and their importance for information flow analysis [13]
    Centrality Measure         Influenced by                                            Persons with a high degree...
    Degree Centr.              incoming resp. outgoing edges                            receive resp. share a lot of information
    Closeness Centr.           the average distance to each other node                  obtain novel information early
    Betweenness Centr.         the location on the shortest paths in the network        need to share much urgent information
    Flow Betweenness Centr.    the location on all paths in the network                 coordinate the information flow
    Eigenvector Centr.         the number of neighbors who are also central             can share important information in a very short time with
                                                                                        the whole network
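
Two of the measures in Table II can be sketched in a few lines of Python. This is an illustrative sketch, not the implementation used in the paper: the function names are hypothetical, edges are treated as unweighted, and closeness is taken here as the inverse of the average hop distance to all reachable nodes, which is one of several common definitions.

```python
from collections import deque


def degree_centrality(nodes, edges):
    """Count incoming plus outgoing edges for every node."""
    degree = {node: 0 for node in nodes}
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return degree


def distances_from(start, edges):
    """Hop distances from start along directed edges (breadth-first search)."""
    successors = {}
    for u, v in edges:
        successors.setdefault(u, []).append(v)
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in successors.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def closeness_centrality(node, edges):
    """Inverse average distance to all nodes reachable from node."""
    dist = distances_from(node, edges)
    others = [d for n, d in dist.items() if n != node]
    if not others:
        return 0.0  # nothing reachable: assumed to mean zero closeness
    return len(others) / sum(others)


# Hypothetical chain of four persons.
nodes = ["A", "B", "C", "D"]
edges = [("A", "B"), ("B", "C"), ("C", "D")]
degrees = degree_centrality(nodes, edges)
closeness_a = closeness_centrality("A", edges)
```

The hop distances returned by distances_from also tell which nodes a person can reach after 1, 2, 3, or more steps.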


different measures for centrality [23]. Kiesling et al. [13] apply                         V. CASE STUDY IN INDUSTRY
commonly used centrality measures to the nodes of a FLOW
diagram. Table II summarizes some centrality measures and                  We present a preliminary study with a globally distributed
explains their relevance for information flow analysis. All                software engineering company [22] to demonstrate the appli-
of these centrality measures are local ones, i.e. they can be              cability of our approach.
calculated for a single node.
   These measures help to detect critical issues in the network
[13]. Some of them are obvious in the FLOW diagram, e.g.                   A. FLOW Analysis
the degree centrality. One only needs to count the incoming
                                                                           According to the procedure presented in Sec. III-B, we have
and outgoing edges of a node and the nodes that have a lot of
                                                                           interviewed the director of the company. In this interview,
them are central in the sense of degree centrality. The persons
                                                                           we gained profound knowledge about the information sharing
represented by the nodes seem to have a certain importance
                                                                           behavior. The FLOW diagram after the first interview is rather
for the information flow process. However, other measures
                                                                           small and clear. In order to present the procedure of our
such as the closeness centrality are difficult to identify in a
                                                                           quantitative analysis, it is sufficient and better to use a rather
network without having calculated the measures. We provide
                                                                           small, but clear FLOW diagram.
a visualization for the closeness centrality which measures the
average distance of a node to all other nodes in the network
in order to facilitate understanding of persons who are central
in the sense of closeness centrality. We call this visualization
network expansion since we highlight all nodes that can be
reached by a certain node after 1, 2, 3, or more steps. The
more nodes a node can reach within a few steps, the higher
the closeness centrality. It remains future work to provide
visualizations for the other centrality measures that are not
easy to determine.

D. Analyzing the resulting network
We implemented a prototype that depicts the transformed FLOW
diagram and then calculates the maximum information flow
from a source to a target. Figure 4 shows a screenshot of
the tool. All fluid information stores are visualized as circles
and all solid stores as rectangles. Currently, activities are
represented as squares but have the same properties as the
other nodes. The distinction between the information stores
only supports the intuitive comprehension of the network; the
algorithm does not distinguish between the types of information
because this distinction is already encoded in the pipe
diameter. In Figure 4, the purple node in the upper left corner
is the source and the orange node in the lower right-hand
corner is the target, i.e. the receiver of the information. The
blue path visualizes the best ways for information sharing
according to the Edmonds-Karp algorithm.
   The tool also includes the visualization of the network
expansion (see Figure 3) and the calculation of the centrality
measures in Table II. To facilitate the interpretation of the
centrality measures, our tool is able to highlight the extreme
values, i.e. those that differ markedly from the others.

   In the first interview, we collected data about the overall
corporate structure, the hierarchies and the communication
behavior within and across the teams. We talked to the director,
who has a good overview of all processes. The whole interview
took about two hours. It started with some demographics about
the director, his experiences and his background. Afterwards,
he described the overall process. We took notes on
the main activities and asked him about the results, incoming
information, supporting tools and persons, and controlling
elements such as templates. In the end, we summarized the
results of the interview, i.e. the information on the main
activities during the process, to avoid misunderstandings.
   We visualized the findings based on the interviews as a
FLOW diagram. The result of the interview is presented
in Figure 2 [22]. The developers work in Spain, while the
customer and the consultants are in Germany. To coordinate
the communication across borders, each team has a team
leader who exchanges information with the other team leader.
A project starts with a workshop (see (1) in Figure 2). The
director, the customer and both team leaders participate in the
workshop. It helps to clarify basic conditions. The workshop
results in a requirements catalog (2) documenting all
requirements. Based on this requirements catalog, the consultants
write story cards for the developers in Spain and create a
concept. A click-dummy (3) visualizes the idea for
the final product. The developers implement the story cards
and regularly exchange information with the consultants (4)
in Germany, who are in contact with the customer. In the end,
the customer receives the remaining software product to test
it and to express change requests. This step is not visualized
in Figure 2.
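
The maximum-flow calculation described under "D. Analyzing the resulting
network" can be illustrated with a minimal sketch of the Edmonds-Karp
algorithm (breadth-first search for shortest augmenting paths). This is not
the prototype's code. The function name, the node names and the pipe
diameters below are hypothetical and were chosen only so that the example is
easy to check by hand; they are not the weightings of Table I.

```python
from collections import deque


def edmonds_karp(capacity, source, target):
    """Return the maximum flow from source to target.

    capacity maps each node to {neighbour: pipe diameter}. All names and
    values used with this function here are illustrative assumptions.
    """
    # Residual network: forward edges carry the pipe diameter,
    # reverse edges start with capacity 0.
    residual = {source: {}, target: {}}
    for u, neighbours in capacity.items():
        residual.setdefault(u, {})
        for v, diameter in neighbours.items():
            residual[u][v] = residual[u].get(v, 0.0) + diameter
            residual.setdefault(v, {}).setdefault(u, 0.0)

    max_flow = 0.0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and target not in parent:
            u = queue.popleft()
            for v, free in residual[u].items():
                if free > 1e-12 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if target not in parent:
            return max_flow  # no augmenting path is left

        # Determine the bottleneck along the path found.
        bottleneck = float("inf")
        v = target
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]

        # Push the bottleneck amount along the path.
        v = target
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        max_flow += bottleneck


# Hypothetical excerpt of a pipe system (diameters are made up).
pipes = {
    "team_lead": {"workshop": 0.5, "consultants": 0.25},
    "workshop": {"consultants": 0.5},
    "consultants": {"developers": 0.5},
}
flow = edmonds_karp(pipes, "team_lead", "developers")

# The same network with the consultants node removed.
without_consultants = {
    u: {v: d for v, d in nbrs.items() if v != "consultants"}
    for u, nbrs in pipes.items()
    if u != "consultants"
}
flow_without = edmonds_karp(without_consultants, "team_lead", "developers")
```

In this toy network every path to the developers passes through the
consultants, so removing that node leaves no augmenting path at all.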
   Figure 2: FLOW diagram after the first interview [22]

B. Tool-Supported Analysis
As an example, we consider the information flow from the team
lead (purple node) to the developers (orange node) in Figure 4.
The workshop, the requirements catalog and the click-dummy
lie on the path of the maximum flow from the team lead to
the developers. The most important node is the consultants
node because it lies on each of the paths. If this node were
missing, information would flow worse or even not at all from the
team leader of the consultants to the developers. The centrality
measures of this node (closeness: 0.08, betweenness: 0.84,
in-degree: 7, out-degree: 12) also support this finding. All of
them are highlighted, i.e. they differ markedly from the
measures of all other nodes. The workshop (betweenness: 0.60,
in-degree: 6, out-degree: 2) and the developers (betweenness:
0.49, in-degree: 4, out-degree: 4) are also very important.
   Comparing the information expansion of the consultants
(central node) and the quality assurance (decentral node) in
Figure 3 illustrates the benefit of this measure. Figure 3a
visualizes the information expansion of the consultants, who
need only 2 steps to cover the whole network. Figure 3b
presents the information expansion of the quality assurance,
which is rather decentral. This node needs 4 steps to cover
the whole network. The main reason for this finding is that
the quality assurance only talks directly to the developers, which
is also visible in the network.

                        VI. DISCUSSION
In the following, we reflect on the limitations of our approach
before interpreting and discussing our findings from the
example above.

A. Critical Appraisal
Because the approach in this paper is mainly conceptual, the
presented idea has two deficiencies that matter
for the interpretation and the reliability of the results obtained
with our approach: (1) the choice of the algorithm
and (2) the weightings of the information flow.
   (1) The choice of the algorithm: There exist many different
algorithms for calculating the maximum flow in a network.
Hence, there might be a better algorithm for our approach.
We decided to use the Ford-Fulkerson method, which is widely
used in graph theory and which is even recommended by
Cormen [18] for considering information flows in a network.
   (2) The weightings of the information flows: The proposed
weightings in Table I rest on some assumptions presented in
subsection IV-A. For calculations, we had to postulate concrete
values. Although the assigned values seem arbitrary, they were
chosen carefully based on the assumptions presented in
subsection IV-A. Slightly different values with a comparable
scaling do not lead to completely different results. The range
from 0 to 1 is based on the calculation of FLOW distance,
which defines the amount of information flow in smaller teams
and also ranges from 0 to 1 [4]. Nonetheless, the values
represent the current state of our research and may need to
be adjusted later due to new findings. However, the basic idea
remains valid even with slight adjustments. Moreover, we
do not consider human factors influencing the information
flow such as a node's absorption of information. The ability
of a person to absorb and internally handle information can
increase or decrease the amount of information to be shared


           (a) Information expansion of a central node                       (b) Information expansion of a decentral node
Figure 3: Stepwise information expansion. The colour of the starting node is red, turning towards blue with each additional
step. Furthermore, the diameter of the coloured circle around the node decreases.
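
The stepwise information expansion shown in Figure 3 can be approximated
with a breadth-first search that records after how many steps each node is
reached. The following is a minimal sketch, not the prototype's code. The
function names and the small graph are hypothetical, and the closeness value
uses one common normalisation (number of reached nodes divided by the sum of
their distances), which is an assumption and not necessarily the definition
used by the tool.

```python
from collections import deque


def expansion_steps(graph, start):
    """Map every node reachable from start to the number of steps needed."""
    steps = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in graph.get(u, ()):
            if v not in steps:  # first visit gives the shortest distance
                steps[v] = steps[u] + 1
                queue.append(v)
    return steps


def closeness(graph, start):
    """Inverse of the average distance from start to the nodes it reaches."""
    steps = expansion_steps(graph, start)
    total = sum(steps.values())
    return (len(steps) - 1) / total if total else 0.0


# Hypothetical toy network; it is NOT the network from the case study.
graph = {
    "consultants": ["developers", "customer", "team_lead"],
    "developers": ["consultants", "quality_assurance"],
    "customer": ["consultants"],
    "team_lead": ["consultants"],
    "quality_assurance": ["developers"],
}

central = expansion_steps(graph, "consultants")
decentral = expansion_steps(graph, "quality_assurance")
steps_central = max(central.values())      # steps to cover the toy network
steps_decentral = max(decentral.values())
```

In this toy graph the central node covers the network in fewer steps than the
decentral node and therefore has the higher closeness value, which mirrors
the qualitative statement that more nodes reachable within a few steps means
higher closeness centrality.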
Figure 4: Screenshot of our prototype calculating the maximum information flow between the purple (source) node and the
orange (receiver) node as well as common centrality measures


with other persons. Furthermore, we consider neither the
loss of information nor misunderstood information that is
falsely shared. We will consider these aspects, in particular
“personalized” weightings, in further research.

B. Interpretation
Information flow in software development teams or companies
is a very complex behavior which requires a manual analysis.
Some structures or patterns in the FLOW diagram are too
interwoven and subliminal to be detected by a simple algorithm.
   We want to support the information flow analysis and
increase the developers' awareness of the complexity of
information sharing by presenting an approach of metaphorically
considering a developer network as a pipe system. Hence, we
quantify the information flow so that the algorithm can be
applied. This helps to increase the trustworthiness of the
findings because they rest on underlying data. Furthermore, it
decreases the risk of missing interpretations and of overlooking
obvious things. It helps to make the subjective analysis more
objective.
   Our case study reflects the advantages of the interplay of
both qualitative and quantitative information flow analysis. The
qualitative method finds complicated structures and patterns
such as the “Chinese-whisper-pattern”. The quantitative analysis
detects the same issues and, after adaptations and extensions,
it may help to increase the developers' awareness of the
current state of information flow by visualizing and simulating
the results. It currently detects ways of information flow that
are too long to be suitable or that are duplicated. However,
the analysis needs to be sharpened and extended to gain real
benefits from the quantifications.
   In the given example, we identified the importance of the
consultants within the process. A FLOW analyst recognizes
the consultants as a so-called “competence spider” because they
are involved in many information flows. As evident from the
FLOW diagram, there are many incoming and outgoing edges
to and from the consultants. Our pipe system supports the
relevance of the consultants. The underlying data confirm that
the consultants are at least mandatory for a fast information
flow from the team leader to the developers. Without the
consultants managing and coordinating the information sharing,
an information flow from the team leader to the developers
cannot be guaranteed. Furthermore, the centrality measures
and the information expansion underline the importance of
the consultants objectively. This objectivity may also help to
increase the awareness of the involved persons. A qualitative
analysis is always subjective, but the calculated measures
are objective and the results easier to comprehend. However,
further analysis is required to support this assumption.
The qualitative and the quantitative method supplement each
other. Nonetheless, the quantitative method needs to be refined
and extended to increase the advantages and to evaluate the
practicability.

                VII. CONCLUSION AND FUTURE WORK
Information exchange is important in software development
teams. Furthermore, information needs to be shared between
the customer, the team, the project leader and many other
involved persons.
   One strategy to analyze information flow is the FLOW
method which aims at observing, examining, and improving
information flow in teams and companies. Until now, the
FLOW method has been a basically subjective method and the results
depend on the analyst's experience and knowledge.
   In this paper, we presented the conceptual idea of
metaphorically considering the developer network as a pipe system.
We defined the amount of information flow between two nodes
in the network based on the states of information flow provided
by FLOW. Information flow can then be analyzed based
on established algorithms in graph theory. In this approach,
we use the Ford-Fulkerson method, which is widely used to
calculate the maximum flow in a flow network.
   We applied our approach to a software development company
whose results after the first interview have already been
analyzed according to the FLOW method [22]. We were
able to detect some findings which are rather obvious (e.g.
the important role of the consultants) and have also been
retrieved in the qualitative analysis. However, we can objectively
support these findings by presenting centrality measures.
   In the future, we will evaluate the influence of our approach
on the developers, i.e. whether our approach increases the awareness
of the relevance of information sharing. Furthermore, we plan
to graphically simulate information flow in the network and
include further metrics to measure the quality of information
flow. Additionally, our automated approach with pipe systems
is still based on a manual elicitation phase of the information
flows. This phase consists of interviews with project
members and drawing a flow diagram. This is highly
time-consuming and requires an experienced analyst. In order to
enable software companies to conduct a flow analysis on their
own without any expertise, we have to automate the elicitation
phase of the flow analysis as well. We address this problem
in ongoing and future research.

                        ACKNOWLEDGMENT
This work was supported by the German Research Foundation
(DFG) under grant number 263807701 (project TeamDynamics,
2018-2020).

                          REFERENCES
 [1] B. Al-Ani and H. K. Edwards, “A comparative empirical study of
     communication in distributed and collocated development teams,” in
     Proceedings of the 3rd IEEE International Conference on Global
     Software Engineering. IEEE, 2008, pp. 35–44.
 [2] T. Wolf, A. Schröter, D. Damian, and T. Nguyen, “Predicting Build
     Failures Using Social Network Analysis on Developer Communication,”
     in Proceedings of the 31st International Conference on Software
     Engineering. IEEE Computer Society, 2009, pp. 1–11.
 [3] J. D. Herbsleb, H. Klein, G. M. Olson, H. Brunner, J. S. Olson, and
     J. Harding, “Object-oriented analysis and design in software project
     teams,” Human–Computer Interaction, vol. 10, no. 2-3, pp. 249–292,
     1995.
 [4] J. Klünder, K. Schneider, F. Kortum, J. Straube, L. Handke, and
     S. Kauffeld, “Communication in Teams - An Expression of Social
     Conflicts,” in Proceedings of the 6th International Conference on
     Human-Centered Software Engineering and 8th International Conference
     on Human Error, Safety, and System Development. Springer International
     Publishing, 2016, pp. 111–129.
 [5] T. Niinimäki, A. Piri, and C. Lassenius, “Factors affecting audio and
     text-based communication media choice in global software development
     projects,” in Proceedings of the 4th IEEE International Conference on
     Global Software Engineering. IEEE, 2009, pp. 153–162.
 [6] D. Damian, S. Marczak, and I. Kwan, “Collaboration patterns and
     the impact of distance on awareness in requirements-centred social
     networks,” in Proceedings of the 15th IEEE International Requirements
     Engineering Conference. IEEE, 2007, pp. 59–68.
 [7] J. D. Herbsleb, “Global Software Engineering: The Future of
     Socio-Technical Coordination,” in Future of Software Engineering. IEEE
     Computer Society, 2007, pp. 188–198.
 [8] E. Bjarnason, K. Wnuk, and B. Regnell, “Requirements are slipping
     through the gaps a case study on causes & effects of communication
     gaps in large-scale software development,” in Proceedings of the 19th
     IEEE International Requirements Engineering Conference. IEEE, 2011,
     pp. 37–46.
 [9] B. Bruegge, A. H. Dutoit, and T. Wolf, “Sysiphus: Enabling informal
     collaboration in global software development,” in Proceedings of the
     International Conference on Global Software Engineering. IEEE, 2006,
     pp. 139–148.
[10] K. Stapel and K. Schneider, “Managing Knowledge on Communication
     and Information Flow in Global Software Projects,” Expert Systems,
     vol. 31, no. 3, pp. 234–252, 2014.
[11] B. S. Caldwell and N. C. Everhart, “Information flow and development
     of coordination in distributed supervisory control teams,” International
     Journal of Human-Computer Interaction, vol. 10, no. 1, pp. 51–70, 1998.
[12] K. Schneider, K. Stapel, and E. Knauss, “Beyond documents: visualizing
     informal communication,” in Proceedings of the 3rd International
     Workshop on Requirements Engineering Visualization. IEEE, 2008,
     pp. 31–40.
[13] S. Kiesling, J. Klünder, D. Fischer, K. Schneider, and K. Fischbach,
     “Applying social network analysis and centrality measures to improve
     information flow analysis,” in Proceedings of the 17th International
     Conference on Product-Focused Software Process Improvement. Springer
     International Publishing, 2016, pp. 379–386.
[14] C. Durugbo, A. Tiwari, and J. R. Alcock, “Modelling information
     flow for organisations: A review of approaches and future challenges,”
     International Journal of Information Management, vol. 33, no. 3,
     pp. 597–610, 2013.
[15] M. Benson-Rea and S. Rawlinson, “Highly skilled and business
     migrants: Information processes and settlement outcomes,” International
     Migration, vol. 41, no. 2, pp. 59–79, 2003.
[16] G. Smith, On the Foundations of Quantitative Information Flow. Berlin,
     Heidelberg: Springer Berlin Heidelberg, 2009, pp. 288–302.
[17] G. Lowe, “Quantifying information flow,” in Proceedings of the 15th
     IEEE Computer Security Foundations Workshop. IEEE, 2002,
     pp. 18–31.
[18] T. Cormen, Introduction to Algorithms. MIT Press, 2009.
[19] K. Stapel, E. Knauss, and K. Schneider, “Using FLOW to improve
     communication of requirements in globally distributed software projects,”
     in Collaboration and Intercultural Issues on Requirements:
     Communication, Understanding and Softskills. IEEE, 2009, pp. 5–14.
[20] J. Klünder, C. Unger-Windeler, F. Kortum, and K. Schneider, “Team
     meetings and their relevance for the software development process over
     time,” in Proceedings of Euromicro Conference on Software Engineering
     and Advanced Applications, 2017.
[21] K. Schneider and O. Liskin, “Exploring flow distance in project
     communication,” in Proceedings of the 8th International Workshop on
     Cooperative and Human Aspects of Software Engineering. IEEE Press,
     2015, pp. 117–118.
[22] J. Klünder and K. Schneider, “Information Flow in Distributed Software
     Projects – A Case Study (orig.: Informationsfluss in verteilten
     Softwareprojekten - Eine Einzelfallstudie),” in PERSONALquarterly,
     69(2), 2017, pp. 10–15.
[23] S. Wasserman and K. Faust, Social Network Analysis: Methods and
     Applications. Cambridge University Press, 1994, vol. 8.
[24] P. Watzlawick, J. B. Bavelas, and D. D. Jackson, Pragmatics of human
     communication: A study of interactional patterns, pathologies and
     paradoxes. WW Norton & Company, 1967.

</pre>