=Paper=
{{Paper
|id=Vol-2245/hufamo_paper_1
|storemode=property
|title=Modeling and Analyzing Information Flow in Development Teams as a Pipe System
|pdfUrl=https://ceur-ws.org/Vol-2245/hufamo_paper_1.pdf
|volume=Vol-2245
|authors=Jil Klünder,Oliver Karras,Nils Prenner,Kurt Schneider
|dblpUrl=https://dblp.org/rec/conf/models/KlunderKPS18
}}
==Modeling and Analyzing Information Flow in Development Teams as a Pipe System==
Jil Klünder, Oliver Karras, Nils Prenner, Kurt Schneider

Leibniz University Hannover, Software Engineering Group, Hannover, Germany. Email: {jil.kluender, oliver.karras, nils.prenner, kurt.schneider}@inf.uni-hannover.de

Abstract: Teamwork is essential for developing valuable software. Working in a team requires an appropriate information exchange among team members in order to avoid loss of information. In order to analyze and improve information flows, it is recommended to observe the information exchange in a team. We propose an approach for modeling the information flow of teams as a pipe system. Different pipe diameters represent the amount of information passing through the pipe. In order to show the applicability of our approach, we conducted a case study in a globally distributed software engineering company. The study consists of the elicitation of information flows inside the company and the automated analysis by our approach. We are able to visualize the information flows, find critical paths such as bottlenecks, and improve information flow structures. This enables project leaders to customize the communication structure to the needs of their team and to prevent loss of information.

Index Terms: Communication behavior, information sharing, development teams, developer network, social network analysis

===I. Introduction===
Communication and information exchange are two essential parts of daily work in software development teams [1]. An inadequate amount of communication is one of the main obstacles hampering successful collaboration [2], [3]. There exist different types of information, such as customer needs or details of story cards, that need to be shared within the team [4]. This intense communication takes place via different communication channels, such as face-to-face, e-mails, chat messages or documents [4], [5]. Developers and stakeholders are often not aware of the ongoing communication behavior and information flow [6], [7]. They frequently struggle to access the right information quickly because it is distributed across different information stores. Information flow is impeded if the developers do not know where to find the required information. In turn, inadequate information sharing complicates teamwork. Analyzing information flow helps a team to become aware of the communication and information sharing [8]–[10]. Detecting critical issues helps to improve the teamwork and to prevent problems like lost information, missing functionality, and a dissatisfied customer. There are different approaches to analyze the information flow in projects [11], [12]. These approaches consider the existence or absence of an information flow. However, they take neither the amount nor the required time to transport information from one person to another into account. Information can flow directly between two persons, but it may also flow via different persons before reaching the required team member.

In this paper, we present the metaphor of a pipe system to account for the wide range of ways to transport information and to facilitate the analysis of the information flow. The pipe's diameter represents the amount of information transmitted per time, e.g., via chat, phone, or face-to-face. Water in a pipe system can take different ways from a source to a target, but it is limited by the smallest pipe on the way. The same holds for information: The maximum amount of information flowing between two persons is limited by the minimum amount of information that can be transported on each part of the way from the sender to the receiver of information.

We aim at visualizing a developer network with different kinds of pipe diameters representing the amount of information flow. The pipe system visualizes information flow within the network and helps to detect critical paths and bottlenecks. This facilitates the developers' understanding of the importance of adequate information sharing. Furthermore, central nodes can be easily recognized in the network. These central nodes are of particular importance for information sharing since they bundle a lot of information [13].

Despite the theoretical orientation of our paper, we apply our approach in an industrial case study with a globally distributed software development company.

The remainder of this paper is structured as follows: In Sec. II, we present related work. In Sec. III, we describe the groundwork of graph theory used in our approach and the information flow analysis with FLOW. In Sec. IV, we introduce our approach for an automated analysis of FLOW diagrams based on pipe systems. Sec. V presents our case study. The results and limitations of our approach are discussed in Sec. VI. We conclude and present future work in Sec. VII.

===II. Related Work===
Our work focuses on quantifying information flow analysis. There are already existing approaches in this area. Durugbo et al. [14] give an overview of existing approaches for modeling information flow. The authors review literature and distinguish between approaches for diagrammatically (i.e. visually) and mathematically (i.e. analytically) modeling information flow. According to the literature review of Durugbo et al. [14], most visualizations are done using graphs or networks. Besides graph theory and social network analysis, probability theory, vector analysis, Markov models and interaction matrices are also applied to analyze the information flows [14]. The choice of visualization and analysis approaches often depends on the level of application: It is possible to analyze information flows on a macro, meso, and micro level [15]. Our concepts go along with Durugbo et al.'s [14] findings, which are not tailored for software development teams.

Smith [16] provides an overview of the foundations of quantitative information flow. He mainly considers the flow of secure information. His approach is based on the concept of vulnerability, which is closely related to Bayes risk measuring uncertainty [16]. The author focuses on sensitive information as a security issue. His approach is not applicable to our information flow analysis, which is based on social interactions.

Lowe [17] presents an approach of quantifying information flow by considering the capacity of a covert channel in a security system. His approach is based on the Communicating Sequential Processes algebra describing interactions between various kinds of communicating processes. Lowe [17] considers an information flow between two users, High and Low, using a covert channel. He wants to quantify the amount of information flowing from High to Low. Lowe's [17] approach is more technical due to its use in security and is not based on human factors.

Kiesling et al. [13] combine the FLOW method for information flow analysis with social network analysis. They apply various centrality measures which are well established in sociology and psychology. These centrality measures detect central persons who are very important for information sharing. The authors consider degree centrality counting the number of incoming and outgoing edges, closeness centrality, betweenness and flow betweenness centrality. The different measures indicate which persons are central from different points of view or which are well situated in the network to share and receive information very fast and easily. This approach also helps to quantify information flow analysis since certain centrality measures underline the qualitative results. Compared to our approach, Kiesling et al. [13] only consider whether there is an edge between two nodes, i.e. whether an information exchange takes place. In contrast, we consider the amount of information which flows over an edge.

===III. Background===
Our approach of modeling information flow is based on graph theory and extends the FLOW method. In the following, we present the necessary basics to understand our approach.

====A. Graph Theory====
We use the Ford-Fulkerson method to realize our approach of considering a developer network as a pipe system [18]. We consider a finite, directed graph <math>G = (V, E)</math> with a set of nodes <math>V</math> and a set of edges <math>E \subseteq V \times V</math>. This graph, regarded as a network, is called a flow network [18]. Let <math>c : V \times V \to \mathbb{R}_{\ge 0}</math> be a non-negative function assigning a capacity <math>c(u, v)</math> to each edge <math>(u, v) \in E</math>. This function is called the capacity function. For <math>(u, v) \notin E</math>, we assume that there is no information flow, i.e. <math>c(u, v) = 0</math>. A flow in <math>G</math> is defined to be a real-valued function <math>f : V \times V \to \mathbb{R}</math> satisfying
# Capacity constraint: <math>\forall u, v \in V:\ f(u, v) \le c(u, v)</math>, i.e. the flow is always less than or equal to the capacity.
# Skew symmetry: <math>\forall u, v \in V:\ f(u, v) = -f(v, u)</math>, i.e. going in the opposite direction leads to a negative flow.
# Flow conservation: <math>\forall u \in V \setminus \{s, t\}:\ \sum_{v \in V} f(u, v) = 0</math>, i.e. there is theoretically no loss of information in a node; the amount of incoming information is equal to the amount of outgoing information.

Given a flow and a capacity function, we can define the residual capacity, which is given by the difference between the actual flow between two nodes and the capacity of the edge connecting them. Formally, the residual capacity is defined to be

:<math>c_f(u, v) = \begin{cases} c(u, v) - f(u, v) & \text{if } (u, v) \in E \\ f(v, u) & \text{if } (v, u) \in E \\ 0 & \text{otherwise} \end{cases}</math>

Then, the residual network <math>G_f = (V, E_f)</math> induced by a flow <math>f</math> is given by the edges in <math>E_f = \{(u, v) \in V \times V : c_f(u, v) > 0\}</math>. Thus, the residual network consists of all nodes of the initial network and the edges are given by those with a not yet maximal flow. An augmenting path <math>p</math> is a simple path from <math>s</math> to <math>t</math> in the residual network <math>G_f</math>. One may show that each edge <math>(u, v)</math> on such a path in the residual network admits some additional positive flow from <math>u</math> to <math>v</math> without violating the capacity constraint on the edge [18]. The residual capacity of an augmenting path <math>p</math> is defined to be the maximum amount by which the flow on each edge in <math>p</math> can be increased without violating the capacity constraint, i.e. <math>c_f(p) = \min\{c_f(u, v) : (u, v) \text{ is on } p\}</math>.

We aim at calculating the maximum information flow <math>f(s, t)</math> between a source <math>s \in V</math> and a target <math>t \in V</math>. To this end, we apply the Ford-Fulkerson method. Algorithm 1 shows the procedure.

 Algorithm 1 FORD-FULKERSON-Method [18, p. 724]
  1: for each edge (u, v) ∈ G.E do
  2:     f(u, v) = 0
  3: end for
  4: while there exists a path p from s to t in the residual network G_f do
  5:     c_f(p) = min{c_f(u, v) : (u, v) ∈ p}
  6:     for each edge (u, v) ∈ p do
  7:         if (u, v) ∈ E then
  8:             f(u, v) = f(u, v) + c_f(p)
  9:         else
 10:             f(v, u) = f(v, u) − c_f(p)
 11:         end if
 12:     end for
 13: end while

In the beginning, the flow of each edge within the network is initialized with zero. Then, we search for a path <math>p</math> from the source <math>s</math> to the target <math>t</math> in the residual network <math>G_f</math> induced by the current flow. We calculate the minimum of all residual capacities on <math>p</math>, i.e. <math>c_f(p)</math>, and increase the flow of each edge on <math>p</math> that is contained in <math>E</math> by this <math>c_f(p)</math>. If an edge is not contained in the initial network, we decrease the flow of the opposite edge <math>(v, u)</math> by the same value. This procedure is well-defined and leads to the requested results [18, p. 714 ff.].

The Edmonds-Karp algorithm implements the Ford-Fulkerson method with polynomial runtime. This algorithm searches for the augmenting path (line 4 in Alg. 1) by using breadth-first search to find the shortest augmenting path from <math>s</math> to <math>t</math>. This is helpful since the distance of the shortest path from <math>s</math> to any other node <math>v \ne t</math> in the residual network increases monotonically with each flow augmentation [18, p. 727]. Using this implementation decreases the runtime, since the total number of flow augmentations is in <math>O(VE)</math> [18, p. 729]. The total runtime is <math>O(VE^2)</math>.

====B. Information flow analysis with FLOW====
FLOW is an established method to analyze and improve the communication in software projects [12]. It is a systematic analysis of information flows in software development projects, but it is also applicable to other kinds of projects. FLOW helps to detect gaps or anomalies in order to avoid a loss of information by providing a structured procedure for evaluating, visualizing, analyzing and improving information flow in teams [12], [19].

FLOW distinguishes between two types of information flows and stores: solid and fluid [19]. Solid information is long-term and repeatably accessible and can be understood by third parties with domain knowledge. An information flow is defined to be fluid whenever one of these three criteria is not met [12]. Solid information is usually captured in written documents, source code or other long-living stores [12]. Examples of fluid information are undocumented meetings, informal face-to-face communication and implicit knowledge [10]. Emails may be either fluid or solid, depending on the further use and storing of the email [10].

Analysts conduct interviews with important members of the project to elicit the communication behavior. The group of interviewed persons often consists of the team leader, one or two developers and other important stakeholders or contributors to the team's work. Each interview starts with a short introduction of the interviewee to give an overview of the main tasks. For each task, the interviewee names the involved persons and the required information (input), the working products (output), and the supporting artifacts such as templates, standards or tools [13].

The results of the FLOW interviews are visualized in a FLOW diagram. Figure 1 presents the main information stores (faces for fluid stores and documents for solid stores). The rectangle represents an activity summarizing parts of the diagram that are too fine-grained for the visualization or that are not further defined during the interviews. Note that the FLOW diagram is a directed network since information mainly flows from one person to another, even if the interaction (mostly communication) is usually bidirectional. If an information flow is bidirectional, i.e. information flows from A to B as well as from B to A, this is visualized by an arrow pointing in both directions.

Figure 1: Exemplary FLOW diagram [20]

Figure 1 visualizes the activity Sprint Planning with incoming information from the Scrum Master, the Product Owner, the Scrum Team and the Backlog, which is usually documented on a task board or in a ticket system and hence solid. The results of the task are Story Cards for the next Sprint. The Template for Story Cards controls the task and Trello (https://trello.com/) supports the activity.

FLOW diagrams mostly comprise various patterns. During the analysis, these patterns need to be found in order to detect weaknesses. For example, a long chain of fluid information stores (also referred to as Chinese-whisper-pattern) [21] should be avoided since the information may change during the transfer due to misunderstandings. Another example of a pattern is the Competence Spider [22]. This is a person that receives and shares a lot of information. An absence of such a person endangers the information flow.

Afterwards, the analyst presents his findings and discusses possible improvements with the project team. He further suggests how the team can use its resources more appropriately.

===IV. Quantifying Information Flow===
In order to support the FLOW analyst during the analysis as well as to allow project leaders and team members to analyze the FLOW diagram, we want to support the analysis with quantitative measures and visualizations. In order to implement our metaphor of a pipe system, we have to make some assumptions which we present in this section.

====A. Weighting Information Flow====
In a first step, we define the diameter of the pipes, which is given by weights for the edges. The weights represent the amount of information flowing between two nodes [23]. A small weight defines a small flow capacity whereas a large weight defines a large flow capacity.

Considering a FLOW diagram, we have to distinguish between four cases without activities and four cases with activities: each combination of fluid and solid stores (solid → solid, solid → fluid, fluid → solid, fluid → fluid) and combinations with an activity (fluid → activity, solid → activity, activity → fluid, activity → solid). We define the amount of information flow between two information stores according to their states as represented in Table I. We define the edge weights representing the amount of information flow between two nodes to range between 0 and 1, where 0 represents no information exchange.

{| class="wikitable"
|+ Table I: Edge weights representing the amount of information flow between two nodes in developer networks
! Source !! Receiver !! Weight
|-
| Fluid || Fluid || 0.7
|-
| Fluid || Solid || 0.5
|-
| Solid || Fluid || 0.4
|-
| Solid || Solid || 0.2
|}

The weights in Table I require some assumptions:
* fluid → fluid: Transporting information directly and personally enables a good flow of information because the receiver is able to directly ask questions in order to understand what the source tells him.
* fluid → solid: Information sharing from a fluid store to a solid one, like writing something down, is worse than information sharing between persons, since the source cannot write everything down or may forget implicit knowledge. Hence, the receiver has less information than the source.
* solid → fluid: Reading is one example of transforming information from a solid store to a fluid one. The amount of information received by reading the document strongly depends on the receiver's knowledge. Missing or not understandable information may not be retrieved by reading the document twice. The only way of receiving missing specific information is talking to the source of the document.
* solid → solid: Information sharing from a solid store to another solid store is given by copying documents or changing the file format. However, since there is no human involved in the process, we assume a very low “collaboration” between these documents.

These assumptions imply the decreasing weights in Table I. In the case of information flows with involved activities, we have to make different assumptions. We know either the source or the receiver, but one of them is an activity, which is, first of all, a black box. Hence, we do not know the granularity of the information flow, i.e. if it is solid or fluid. Sometimes, knowledge about the activity helps to decide whether an information flow is solid or fluid, but in many cases, this decision is not possible.

We consider information flows within activities as the worst case and, in case of doubt, choose the minimum edge weight, which is given by the information flow between two solid stores, i.e. 0.2. We assume that information flows during each activity, since according to Watzlawick et al. “one cannot not communicate” [24]. However, it is also possible to reduce the activities as described in the following.

====B. FLOW Diagrams as Pipe Systems====
We aim at applying the Edmonds-Karp algorithm of the Ford-Fulkerson method for calculating the maximum flow within a network to the FLOW diagram. In order to apply this algorithm, we first have to adjust the FLOW diagram, since the algorithm only considers nodes and edges, but no activities as in FLOW. As presented by Kiesling et al. [13], there are mainly three possibilities of transforming a FLOW diagram into a network:
# Connect all incoming and outgoing stores of the activity.
# Represent the activity with a separate node and connect all incoming and outgoing nodes with this one by preserving the direction of flow.
# Decide for each incoming node whether it should be directly connected to an outgoing one or to a separately defined one. This case requires deeper insights into the activity.

There are some prerequisites deciding which case is most suitable.
* Case 1 implies that there is an information flow from each incoming to each outgoing node involved in the activity. For example, in case of a meeting, this might be correct. However, activities often represent a further communication network where the information does not necessarily flow from each incoming to each outgoing node.
* Case 2 is the most neutral way of integrating an activity within a network. After having analyzed the network, it might be possible to re-transfer the network into a FLOW diagram and adapt the results for analyzing the diagram. In case of doubt, we recommend using this possibility.
* Case 3 is the most exact way of dissolving an activity. This case requires deep insights into the activity that cannot always be achieved.

Depending on the choice of the case, there are different interpretations of the resulting network. In order to connect both approaches, i.e. the FLOW analysis and the algorithmic extension, it is required not to change the core statements of the network. Hence, the transformation method needs to be carefully selected for each activity.

After this step, we receive a FLOW network only consisting of information stores and information flows, i.e. nodes and edges. We are now able to apply the algorithm to the network.

====C. Quantitative Measures====
Based on the pipe system and the use of the Edmonds-Karp algorithm, we calculate the maximum flow between two nodes. Comparing the results between different nodes and considering the paths of the maximum flow helps to detect central persons, i.e. persons who are very important for the information flow. A person can be central from different viewpoints. Social network analysis is a wide field providing different measures for centrality [23]. Kiesling et al. [13] apply commonly used centrality measures to the nodes of a FLOW diagram. Table II summarizes some centrality measures and explains their relevance for information flow analysis. All of these centrality measures are local ones, i.e. they can be calculated for a single node. These measures help to detect critical issues in the network [13].

{| class="wikitable"
|+ Table II: Overview of centrality measures and their importance for information flow analysis [13]
! Centrality measure !! Influenced by !! Persons with a high degree...
|-
| Degree centrality || incoming resp. outgoing edges || receive resp. share a lot of information
|-
| Closeness centrality || the average distance to each other node || obtain novel information early
|-
| Betweenness centrality || the location on the shortest paths in the network || need to share much urgent information
|-
| Flow betweenness centrality || the location on all paths in the network || coordinate the information flow
|-
| Eigenvector centrality || the number of neighbors who are also central || can share important information in a very short time with the whole network
|}

Some of them are obvious in the FLOW diagram, e.g. the degree centrality. One only needs to count the incoming and outgoing edges of a node, and the nodes that have a lot of them are central in the sense of degree centrality. The persons represented by the nodes seem to have a certain importance for the information flow process. However, other measures such as the closeness centrality are difficult to identify in a network without having calculated the measures. We provide a visualization for the closeness centrality, which measures the average distance of a node to all other nodes in the network, in order to facilitate understanding of persons who are central in the sense of closeness centrality. We call this visualization network expansion since we highlight all nodes that can be reached by a certain node after 1, 2, 3, or more steps. The more nodes a node can reach within a few steps, the higher the closeness centrality. It remains future work to provide visualizations for the other centrality measures that are not easy to determine.

====D. Analyzing the resulting network====
We implemented a prototype depicting the transformed FLOW diagram and then calculating the maximum information flow from a source to a target. Figure 4 shows a screenshot of the tool. All fluid information stores are visualized as circles and all solid stores are rectangles. Currently, activities are represented as squares but have the same properties as the other nodes. The distinction between the information stores only supports the intuitive comprehension of the network; the algorithm does not distinguish between the types of information because this information is already contained in the pipe diameter. In Figure 4, the purple node in the upper left corner is the source and the orange node in the lower right-hand corner is the target, i.e. the receiver of the information. The blue path visualizes the best ways for information sharing according to the Edmonds-Karp algorithm.

The tool also includes the visualization of the network expansion (see Figure 3) and the calculation of the centrality measures in Table II. To facilitate the interpretation of the centrality measures, our tool is able to highlight the extreme values, i.e. those differing from the other ones.

===V. Case Study in Industry===
We present a preliminary study with a globally distributed software engineering company [22] to demonstrate the applicability of our approach.

====A. FLOW Analysis====
According to the procedure presented in Sec. III-B, we have interviewed the director of the company. In this interview, we gained profound knowledge about the information sharing behavior. The FLOW diagram after the first interview is rather small and clear. In order to present the procedure of our quantitative analysis, it is sufficient and better to use a rather small, but clear FLOW diagram.

In the first interview, we collected data about the overall corporate structure, the hierarchies and the communication behavior within and across the teams. We talked to the director who has a good overview of all processes. The whole interview took about two hours. It started with some demographics about the director, his experiences and his background. Afterwards, he started describing the overall process. We took notes on the main activities and asked him about the results, incoming information, supporting tools and persons, and controlling elements such as templates. In the end, we summarized the results of the interview, i.e. the information on the main activities during the process, to avoid misunderstandings.

We visualized the findings based on the interviews as a FLOW diagram. The result of the interview is presented in Figure 2 [22]. The developers work in Spain, while the customer and the consultants are in Germany. To coordinate the communication across borders, each team has a team leader who exchanges information with the other team leader. A project starts with a workshop (see (1) in Figure 2). The director, the customer and both team leaders participate in the workshop. It helps to clarify basic conditions. The workshop results in a requirements catalog (2) documenting all requirements. Based on this requirements catalog, the consultants write story cards for the developers in Spain and create a concept. A click-dummy (3) basically visualizes the idea for the final product. The developers implement the story cards and regularly exchange information with the consultants (4) in Germany, who are in contact with the customer. In the end, the customer receives the remaining software product to test it and to express change requests. This step is not visualized in Figure 2.

Figure 2: FLOW diagram after the first interview [22]

====B. Tool-Supported Analysis====
As an example, we consider the information flow from the team lead (purple node) to the developers (orange node) in Figure 4. The workshop, the requirements catalog and the click-dummy lie on the path of the maximum flow from the team lead to the developers. The most important node is the consultants because they lie on each of the paths. If this node is missing, information would flow worse or even not at all from the team leader of the consultants to the developers. The centrality measures of this node (closeness: 0.08, betweenness: 0.84, in-degree: 7, out-degree: 12) also support this fact. All of them are highlighted, i.e. they differ remarkably from the measures of all other nodes. The workshop (betweenness: 0.60, in-degree: 6, out-degree: 2) and the developers (betweenness: 0.49, in-degree: 4, out-degree: 4) are also very important.

Considering the information expansion of the consultants (central node) and the quality assurance (decentral node) in Figure 3 illustrates the benefit of this measure. Figure 3a visualizes the information expansion of the consultants, who only need 2 steps to cover the whole network. Figure 3b presents the information expansion of the quality assurance, which is rather decentral. This node needs 4 steps to cover the whole network. The main reason for this finding is that the quality assurance only talks directly to the developers, which can also be found in the network.

Figure 3: Stepwise information expansion. (a) Information expansion of a central node. (b) Information expansion of a decentral node. The colour of the starting node is red, turning towards blue with each additional step. Furthermore, the diameter of the coloured circle around the node decreases.

Figure 4: Screenshot of our prototype calculating the maximum information flow between the purple (source) node and the orange (receiver) node as well as common centrality measures

===VI. Discussion===
In the following, we reflect on the limitations of our approach before interpreting and discussing our findings from the example above.

====A. Critical Appraisal====
Due to the basically conceptual approach in this paper, the presented idea has two deficiencies which matter for the interpretation and the reliability of the results after the application of our approach: (1) the choice of the algorithm and (2) the weightings of the information flow.

(1) The choice of the algorithm: There exist many different algorithms for calculating the maximum flow in a network. Hence, there might be a better algorithm for our approach. We decided to use the Ford-Fulkerson method, which is widely used in graph theory and which is even recommended by Cormen [18] for considering information flows in a network.

(2) The weightings of the information flows: The proposed weightings in Table I are based on the assumptions presented in subsection IV-A. For calculations, we had to postulate concrete values. Although the assigned values seem arbitrary, they are chosen carefully based on these assumptions, and slightly different values with a comparable scaling do not lead to completely different results. The range from 0 to 1 is based on the calculation of FLOW distance, which defines the amount of information flow in smaller teams and also ranges from 0 to 1 [4]. Nonetheless, the values represent the current state of our research and may need to be adjusted later due to new findings. However, the basic idea remains valid also with slight adjustments. Nonetheless, we do not consider human factors influencing the information flow such as a node's absorption of information. The ability of a person to absorb and internally handle information can increase or decrease the amount of information to be shared with other persons. Furthermore, we consider neither the loss of information nor misunderstood information that is falsely shared. We will consider these aspects, in particular “personalized” weightings, in further research.

====B. Interpretation====
Information flow in software development teams or companies is a very complex behavior which requires a manual analysis. Some structures or patterns in the FLOW diagram are too interwoven and subliminal to be detected by a simple algorithm.

We want to support the information flow analysis and increase the developers' awareness of the complexity of information sharing by presenting an approach of metaphorically considering a developer network as a pipe system. Hence, we quantify information flow analysis to apply the algorithm. This helps to increase the trustworthiness of the findings due to underlying data. Furthermore, it decreases the possibility of missing interpretations and forgotten obvious things. It helps to make the subjective analysis more objective.

Our case study reflects the advantages of the interplay of both qualitative and quantitative information flow analysis. The qualitative method finds complicated structures and patterns such as the “Chinese-whisper-pattern”. The quantitative analysis detects the same issues and, after adaptations and extensions, it may help to increase the developers' awareness of the current state of information flow by visualizing and simulating the results. It currently detects ways of information flow that are too long to be suitable or that are duplicated. However, the analysis needs to be sharpened and extended to gain real benefits from the quantifications.

In the given example, we figured out the importance of the consultants within the process. A FLOW analyst recognizes the consultants as a so-called “competence spider” because they are involved in many information flows. As evident from the FLOW diagram, there are many incoming and outgoing edges from and to the consultants. Our pipe system supports the relevance of the consultants. The underlying data confirm that the consultants are at least mandatory for a fast information flow from the team leader to the developers. Without the consultants managing and coordinating the information sharing, an information flow from the team leader to the developers cannot be guaranteed. Furthermore, the centrality measures and the information expansion underline the importance of the consultants objectively. This objectivity may also help to increase the awareness of the involved persons. A qualitative analysis is always subjective, but the calculated measures are objective and the results easier to comprehend. However, there is some analysis required to support this assumption. The qualitative and the quantitative method supplement each other. Nonetheless, the quantitative method needs to be refined and extended to increase the advantages and to evaluate the practicability.

===VII. Conclusion and Future Work===
Information exchange is important in software development teams. Furthermore, information needs to be shared between the customer, the team, the project leader and many other involved persons.

One strategy to analyze information flow is the FLOW
method which aims at observing, examining, and improving information flow in teams and companies. Until now, the FLOW method has been an essentially subjective method, and its results depend on the analyst's experience and knowledge.

In this paper, we presented the conceptual idea of metaphorically considering the developer network as a pipe system. We defined the amount of information flow between two nodes in the network based on the states of information flow provided by FLOW. Information flow can then be analyzed based on established algorithms in graph theory. In this approach, we use the Ford-Fulkerson method, which is widely used to calculate the maximum flow in a flow network.

We applied our approach to a software development company whose results from the first interview had already been analyzed according to the FLOW method [22]. We were able to detect some findings which are rather obvious (e.g., the important role of the consultants) and which were also retrieved in the qualitative analysis. However, we can objectively support these findings by presenting centrality measures.

In the future, we will evaluate the influence of our approach on the developers, i.e., whether our approach increases the awareness of the relevance of information sharing. Furthermore, we plan to graphically simulate information flow in the network and include further metrics to measure the quality of information flow. Additionally, our automated approach with pipe systems is still based on a manual elicitation phase of the information flows. This phase consists of interviews with project members and drawing a FLOW diagram. This is highly time-consuming and requires an experienced analyst. In order to enable software companies to conduct a FLOW analysis on their own without any expertise, we have to automate the elicitation phase of the FLOW analysis as well. We are addressing this problem in ongoing and future research.

ACKNOWLEDGMENT

This work was supported by the German Research Foundation (DFG) under grant number 263807701 (project TeamDynamics, 2018-2020).

REFERENCES

[1] B. Al-Ani and H. K. Edwards, “A comparative empirical study of communication in distributed and collocated development teams,” in Proceedings of the 3rd IEEE International Conference on Global Software Engineering. IEEE, 2008, pp. 35–44.
[2] T. Wolf, A. Schröter, D. Damian, and T. Nguyen, “Predicting Build Failures Using Social Network Analysis on Developer Communication,” in Proceedings of the 31st International Conference on Software Engineering. IEEE Computer Society, 2009, pp. 1–11.
[3] J. D. Herbsleb, H. Klein, G. M. Olson, H. Brunner, J. S. Olson, and J. Harding, “Object-oriented analysis and design in software project teams,” Human–Computer Interaction, vol. 10, no. 2-3, pp. 249–292, 1995.
[4] J. Klünder, K. Schneider, F. Kortum, J. Straube, L. Handke, and S. Kauffeld, “Communication in Teams - An Expression of Social Conflicts,” in Proceedings of the 6th International Conference on Human-Centered Software Engineering and 8th International Conference on Human Error, Safety, and System Development. Springer International Publishing, 2016, pp. 111–129.
[5] T. Niinimäki, A. Piri, and C. Lassenius, “Factors affecting audio and text-based communication media choice in global software development projects,” in Proceedings of the 4th IEEE International Conference on Global Software Engineering. IEEE, 2009, pp. 153–162.
[6] D. Damian, S. Marczak, and I. Kwan, “Collaboration patterns and the impact of distance on awareness in requirements-centred social networks,” in Proceedings of the 15th IEEE International Requirements Engineering Conference. IEEE, 2007, pp. 59–68.
[7] J. D. Herbsleb, “Global Software Engineering: The Future of Socio-Technical Coordination,” in Future of Software Engineering. IEEE Computer Society, 2007, pp. 188–198.
[8] E. Bjarnason, K. Wnuk, and B. Regnell, “Requirements are slipping through the gaps - a case study on causes & effects of communication gaps in large-scale software development,” in Proceedings of the 19th IEEE International Requirements Engineering Conference. IEEE, 2011, pp. 37–46.
[9] B. Bruegge, A. H. Dutoit, and T. Wolf, “Sysiphus: Enabling informal collaboration in global software development,” in Proceedings of the International Conference on Global Software Engineering. IEEE, 2006, pp. 139–148.
[10] K. Stapel and K. Schneider, “Managing Knowledge on Communication and Information Flow in Global Software Projects,” Expert Systems, vol. 31, no. 3, pp. 234–252, 2014.
[11] B. S. Caldwell and N. C. Everhart, “Information flow and development of coordination in distributed supervisory control teams,” International Journal of Human-Computer Interaction, vol. 10, no. 1, pp. 51–70, 1998.
[12] K. Schneider, K. Stapel, and E. Knauss, “Beyond documents: visualizing informal communication,” in Proceedings of the 3rd International Workshop on Requirements Engineering Visualization. IEEE, 2008, pp. 31–40.
[13] S. Kiesling, J. Klünder, D. Fischer, K. Schneider, and K. Fischbach, “Applying social network analysis and centrality measures to improve information flow analysis,” in Proceedings of the 17th International Conference on Product-Focused Software Process Improvement. Springer International Publishing, 2016, pp. 379–386.
[14] C. Durugbo, A. Tiwari, and J. R. Alcock, “Modelling information flow for organisations: A review of approaches and future challenges,” International Journal of Information Management, vol. 33, no. 3, pp. 597–610, 2013.
[15] M. Benson-Rea and S. Rawlinson, “Highly skilled and business migrants: Information processes and settlement outcomes,” International Migration, vol. 41, no. 2, pp. 59–79, 2003.
[16] G. Smith, On the Foundations of Quantitative Information Flow. Berlin, Heidelberg: Springer Berlin Heidelberg, 2009, pp. 288–302.
[17] G. Lowe, “Quantifying information flow,” in Proceedings of the 15th IEEE Computer Security Foundations Workshop. IEEE, 2002, pp. 18–31.
[18] T. Cormen, Introduction to Algorithms. MIT Press, 2009.
[19] K. Stapel, E. Knauss, and K. Schneider, “Using FLOW to improve communication of requirements in globally distributed software projects,” in Collaboration and Intercultural Issues on Requirements: Communication, Understanding and Softskills. IEEE, 2009, pp. 5–14.
[20] J. Klünder, C. Unger-Windeler, F. Kortum, and K. Schneider, “Team meetings and their relevance for the software development process over time,” in Proceedings of Euromicro Conference on Software Engineering and Advanced Applications, 2017.
[21] K. Schneider and O. Liskin, “Exploring flow distance in project communication,” in Proceedings of the 8th International Workshop on Cooperative and Human Aspects of Software Engineering. IEEE Press, 2015, pp. 117–118.
[22] J. Klünder and K. Schneider, “Information Flow in Distributed Software Projects – A Case Study (orig.: Informationsfluss in verteilten Softwareprojekten - Eine Einzelfallstudie),” in PERSONALquarterly, 69(2), 2017, pp. 10–15.
[23] S. Wasserman and K. Faust, Social Network Analysis: Methods and Applications. Cambridge University Press, 1994, vol. 8.
[24] P. Watzlawick, J. B. Bavelas, and D. D. Jackson, Pragmatics of human communication: A study of interactional patterns, pathologies and paradoxes. WW Norton & Company, 1967.
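
As an illustration of the pipe-system analysis summarized in Section VII, the following minimal Python sketch computes a maximum information flow with Edmonds-Karp, a breadth-first variant of the Ford-Fulkerson method. It is not the prototype shown in Figure 4. The function name max_information_flow, the edges, and the weights are hypothetical and are not taken from Table I or from the case study data; only the node names and the weight range from 0 to 1 follow the text.

```python
from collections import deque


def max_information_flow(capacity, source, sink):
    """Maximum flow from source to sink (Edmonds-Karp, BFS-based Ford-Fulkerson).

    capacity maps each node to a dict {neighbour: pipe diameter in [0, 1]}.
    """
    # Residual network: copy the pipes and add reverse edges with capacity 0.
    residual = {u: dict(edges) for u, edges in capacity.items()}
    for u, edges in capacity.items():
        for v in edges:
            residual.setdefault(v, {}).setdefault(u, 0.0)
    residual.setdefault(source, {})
    residual.setdefault(sink, {})

    total = 0.0
    while True:
        # Breadth-first search for an augmenting path with spare capacity.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total  # no further path: the flow is maximal

        # The narrowest pipe on the path limits how much can be pushed.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]

        # Push the bottleneck amount along the path and update the residuals.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck


# Hypothetical example network; the weights are invented for illustration.
network = {
    "team_lead": {"consultants": 1.0},
    "consultants": {"developers": 0.5, "workshop": 0.75},
    "workshop": {"developers": 0.25},
}

print(max_information_flow(network, "team_lead", "developers"))  # prints 0.75
```

In this invented network the pipe from the team lead to the consultants carries up to 1.0, but the pipes leaving the consultants towards the developers admit only 0.5 directly and 0.25 via the workshop, so the maximum flow is 0.75. Removing the consultants leaves no path from the team lead to the developers, and the flow drops to 0.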