<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Understanding the Involvement of Developers in Missing Link Community Smell: An Exploratory Study on Apache Projects</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Toukir</forename><surname>Ahammed</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Institute of Information Technology</orgName>
								<orgName type="institution">University of Dhaka</orgName>
								<address>
									<settlement>Dhaka</settlement>
									<country key="BD">Bangladesh</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Moumita</forename><surname>Asad</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Institute of Information Technology</orgName>
								<orgName type="institution">University of Dhaka</orgName>
								<address>
									<settlement>Dhaka</settlement>
									<country key="BD">Bangladesh</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Kazi</forename><surname>Sakib</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Institute of Information Technology</orgName>
								<orgName type="institution">University of Dhaka</orgName>
								<address>
									<settlement>Dhaka</settlement>
									<country key="BD">Bangladesh</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Understanding the Involvement of Developers in Missing Link Community Smell: An Exploratory Study on Apache Projects</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">C6557742B1D52AC3925762E281B05EFC</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T20:02+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>missing link smell</term>
					<term>community smell</term>
					<term>software engineering</term>
					<term>empirical analysis</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Missing link smell occurs when developers collaborate on source code without communicating. It can hamper software maintenance through a lack of mutual awareness, mistrust and knowledge gaps. Existing studies have investigated the relationship of missing link smell with code smells and with socio-technical factors such as turnover. This study aims to understand how many developers are involved in missing link smell by calculating the percentage of smelly developers in a project. It also investigates the relationship between a developer's number of contributions and number of missing link involvements. The results show that the percentage of developers involved in missing link smell is 8.7% on average. The results also suggest a moderate positive correlation between a developer's contribution to the project and involvement in the smell.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>Community smells are the organizational and social antipatterns in a development community <ref type="bibr" target="#b0">[1]</ref>. Community smells may lead to the emergence of social debt which indicates unforeseen project costs connected to a suboptimal software development community. Community smells may not be an immediate obstacle for software development but these can affect software maintenance negatively in the long run <ref type="bibr" target="#b1">[2]</ref>. Missing link is one of the common community smells. It refers to the condition when two co-committing developers show uncooperative behavior by not communicating <ref type="bibr" target="#b2">[3]</ref>.</p><p>Missing link community smell decreases communication activities in the development community. The lack of communication and cooperation negatively affects mutual awareness and trust among developers <ref type="bibr" target="#b2">[3]</ref>. A software product can be thought of as the combined effort of all developers. So, collaboration along with proper communication is necessary among developers. It is important to know how many developers are involved in missing link smell as they may affect the whole project. Identifying these developers and analyzing their characteristics is important. This will help the project managers to take steps such as task reassigning, team reformation, increasing awareness about communication etc. to keep communication issues lower among the developers in the community.</p><p>QuASoQ 2020: 8th International Workshop on Quantitative Approaches to Software Quality email: bsse0806@iit.du.ac.bd (T. Ahammed); bsse0731@iit.du.ac.bd (M. Asad); sakib@iit.du.ac.bd (K. Sakib)</p><p>The detection of missing link smell and its impact on software artifacts have been analyzed in previous studies. S. 
Magnoni proposed the identification pattern of the missing link community smell <ref type="bibr" target="#b2">[3]</ref>. Tamburri et al. examined the relationship between community smells and different socio-technical factors, e.g., socio-technical congruence and turnover <ref type="bibr" target="#b3">[4]</ref>. That study considered the missing link, organizational silo, black cloud and radio silence community smells. Palomba et al. investigated the impact of missing link smell and four other community smells on code smell intensity <ref type="bibr" target="#b1">[2]</ref>. <ref type="bibr">Catolino et al.</ref> analyzed the role of gender diversity and women's participation in four community smells in open-source communities <ref type="bibr" target="#b4">[5]</ref>. However, developer involvement in missing link smell and how developer contributions to a project relate to missing link smell have not been analyzed yet. In this context, the current study focuses on these factors by addressing the following Research Questions (RQs).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>RQ1: How many developers are involved in missing link community smell?</head><p>In an open-source project, there can be many developers who contribute to the project. Not all of them may be involved in missing link community smell. This RQ aims to find how many developers are involved in missing link smells in a community. This is important for knowing the collective contribution of developers to the number of missing link smells in a project. This finding will help project managers understand the severity of communication issues among developers in the community. The action taken to mitigate missing link smell can differ based on the number of developers involved in smells.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>RQ2: How does missing link smell relate to a developer's contribution?</head><p>This RQ focuses on the involvement of individual developers in missing link smell. It relates an important characteristic of a developer, i.e., contribution, to missing link smell. This finding will help project managers understand which types of developers are more involved in missing link smell. This information can be used to decide which developers should be monitored to control missing link smell in the community from the beginning of a project.</p><p>In this study, missing link smells are analyzed in seven open-source projects of the Apache ecosystem. These projects are selected because they are large enough to analyse and their communication data, i.e., mailing lists, are available. First, the instances of missing link smell are detected in each project. The missing link smell is identified by finding cases where a collaboration does not have its communication counterpart. Then the developers associated with each smell are identified by extracting the instances of the smell. The fraction of developers involved in missing link smell is calculated to check whether only a subset of developers is involved in this type of smell. Then the correlation between the contribution of developers and their involvement in missing link smells is investigated.</p><p>The results of the study show that a small part of the total developers is involved in missing link community smell. On average, 8.7% of the total developers of a project are involved in missing link smell. This study also finds a significant moderate positive correlation between developer contribution and involvement in missing link smell.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Background</head><p>This section introduces some important terminology needed to understand the missing link community smell.</p><p>Developer Social Network (DSN): A network of a software development community where a node represents a developer and an edge represents a relationship between developers, e.g., communication or coordination.</p><p>Collaboration Network: A specific type of DSN which indicates the collaboration in a development community. Here, a node represents a developer who contributes to the project in the version control system. Two developers are connected through an edge if they contribute to the same part of source code within a given time frame <ref type="bibr" target="#b2">[3]</ref>. Figure <ref type="figure">2</ref> represents an example of a collaboration network.</p><p>Communication Network: A specific type of DSN which indicates the communication within the defined communication channel of a development community. Here, a node represents a developer who communicates. Two developers are connected through an edge if they replied in the same e-mail within a given time frame <ref type="bibr" target="#b2">[3]</ref>. A communication network is illustrated in Figure <ref type="figure">3</ref>.</p><p>Missing Link Community Smell: A missing link community smell occurs when a pair of developers collaborate with each other but show uncooperative behavior by not communicating. This smell can be identified by detecting a collaboration between two developers that does not have a communication counterpart in the defined communication channel, e.g., the development mailing list <ref type="bibr" target="#b2">[3]</ref>.</p><p>An example of a DSN is illustrated in Figure <ref type="figure" target="#fig_0">1</ref>. The upper part of the graph represents communication and the lower part represents the collaboration among developers. 
The developers are connected with a solid line if they communicate with each other. The developers are connected to the file icon through a dashed line if they contribute to that source code file.</p><p>The collaboration and communication networks can be generated separately from this DSN. Figure <ref type="figure">2</ref> and Figure <ref type="figure">3</ref> represent the collaboration and the communication network respectively. The missing link smell can be identified by comparing the collaboration network with the communication network. There is a link between developers E and F in the collaboration network (Figure <ref type="figure">2</ref>) but there is no corresponding link between these two developers in the communication network (Figure <ref type="figure">3</ref>). Developers E and F are collaborating on the same part of source code but they are not connected through any communication link. Thus, this is considered an instance of a missing link between developers E and F.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Related Work</head><p>In recent years, community smells have been studied to incorporate the organizational and social aspects of the developer community into software engineering research. Some studies focused on defining different community smells that can lead to unforeseen project costs <ref type="bibr" target="#b0">[1]</ref>, <ref type="bibr" target="#b5">[6]</ref>. On the other hand, some studies investigated the impact of community smells.</p><p>Tamburri et al. first introduced the concept of social debt in software engineering <ref type="bibr" target="#b5">[6]</ref>. Later, in an industrial case study, they improved and elaborated the definition of social debt. In the same study, they defined nine different community smells which are connected to social debt <ref type="bibr" target="#b0">[1]</ref>. They also suggested a list of possible mitigations of community smells, such as learning community, cultural conveyors and stand-up voting, to avoid the negative effects.</p><p>Magnoni proposed the identification pattern of four out of the nine community smells <ref type="bibr" target="#b2">[3]</ref> defined in <ref type="bibr" target="#b0">[1]</ref>. He developed an open-source tool, CODEFACE4SMELLS<ref type="foot" target="#foot_0">1</ref>, as an extension to CODEFACE <ref type="bibr" target="#b6">[7]</ref>. This tool is capable of detecting community smells from the change history in the version control system and the communication history in the development mailing list.</p><p>Tamburri et al. analysed the distribution of community smells in open-source projects <ref type="bibr" target="#b3">[4]</ref>. They also assessed the relation between community smells and existing socio-technical quality factors, e.g., socio-technical congruence, communicability and turnover.</p><p>Palomba et al. examined the relationship between social and technical debt <ref type="bibr" target="#b1">[2]</ref>, <ref type="bibr" target="#b7">[8]</ref>. They assessed the impact of community smells on code smells and found that community smells significantly influence code smell intensity. They also proposed a community-aware code smell intensity model in which both technical and community-related factors were considered.</p><p>Catolino et al. analysed the role of gender diversity and women's participation in community smells <ref type="bibr" target="#b4">[5]</ref>. They considered four types of community smell, i.e., Organizational Silo, Lone Wolf, Black Cloud and Radio Silence. They found that gender-diverse teams had a lower number of community smells than non-gender-diverse teams. They also showed that gender diversity and women's participation were important factors for Black Cloud and Radio Silence, whereas Organizational Silo and Lone Wolf were found to be partially related.</p><p>The existing studies have focused on community smells and the impact of these smells on software artifacts. Community smells arise from the developers in a development community. However, developer involvement in missing link smell and the relation between missing link smell and developer contributions have not been investigated yet. So, the developers involved in community smells and how their contribution relates to missing link smell need to be explored.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Methodology</head><p>This study aims to understand how many developers of a project are involved in missing link smell. It also assesses the relationship between a developer's contribution and involvement in missing link smell. First, the missing link smell is detected for all the selected projects. Then the percentage of smelly developers is calculated for each project. Finally, a correlation analysis is performed between a developer's contribution and involvement in missing link smell.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.">Dataset</head><p>In this work, seven large open-source projects belonging to the Apache ecosystem are selected for analysis. These projects have been chosen because they are large and their mailing lists are publicly available. Table <ref type="table" target="#tab_0">1</ref> provides the list of analysed projects with their name, source code link, development mailing list and analysis period. All projects are hosted on GitHub and the development mailing list archives are available on Gmane<ref type="foot" target="#foot_1">2</ref>.</p><p>The selected projects are large enough in terms of community members and the number of commits. The projects have 668 community members on average. All the projects have a substantial number of commits, with an average of 10359. Thus the study has enough collaboration and communication data for analysis.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.">Missing Link Smell Detection</head><p>The selected projects are analysed using a six-month analysis window. The analysis period of a project starts from the point when both the communication in the mailing list and the change history in the repository are available. A few more months are excluded to make the analysis period divisible by six months. The analysis period for each project is given in Table <ref type="table" target="#tab_0">1</ref>. For example, the Apache Cassandra project has an analysis period of 11 years, from October 2009 to September 2020.</p><p>For every analysis window of a project, a communication network and a collaboration network are built. The communication network is generated by extracting communication data from the development mailing list and the collaboration network is generated by extracting collaboration data from the project repository. Once both networks are available, the instances of missing link smell are identified by comparing every collaboration link with the communication network. If a collaboration link does not have its communication counterpart, this link is identified as a missing link instance.</p><p>An open-source tool, CODEFACE4SMELLS <ref type="bibr" target="#b3">[4]</ref>, is used to detect missing link community smell in this study. This tool is capable of detecting missing link smell in the aforementioned way from the project repository and the development mailing list. The tool requires the links of the source code repository and the mailing list archive as input. It then returns a list of missing link instances for each window of the project. A missing link instance is represented by a pair of developers. For example, (𝑎, 𝑏) represents a missing link instance between developers 𝑎 and 𝑏.</p></div>
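<div xmlns="http://www.tei-c.org/ns/1.0"><p>The comparison described above can be sketched as a set difference between the edges of the two networks. This is only an illustrative sketch, not the CODEFACE4SMELLS implementation: the function names and the example edges are hypothetical, and the real tool may apply additional conditions per analysis window.</p>
<p>
```python
def normalize(a, b):
    """Represent an undirected link as a sorted tuple, so (a, b) equals (b, a)."""
    return tuple(sorted((a, b)))


def find_missing_links(collaboration_edges, communication_edges):
    """Return collaboration links that have no communication counterpart."""
    collaboration = {normalize(a, b) for a, b in collaboration_edges}
    communication = {normalize(a, b) for a, b in communication_edges}
    # A missing link is a collaboration edge absent from the communication network.
    return sorted(collaboration - communication)


# Hypothetical edges for one analysis window (illustrative data only).
collaboration = [("A", "B"), ("B", "C"), ("E", "F")]
communication = [("B", "A"), ("C", "B"), ("D", "E")]

missing = find_missing_links(collaboration, communication)
print(missing)  # [('E', 'F')]
```
</p><p>Here developers E and F collaborate without a communication link, so the pair (E, F) is reported, mirroring the example in the Background section.</p></div>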
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3.">Smelly Developers Identification</head><p>A developer involved in a missing link smell is considered a smelly developer. An instance of missing link smell consists of two collaborating developers who do not communicate with each other. Thus, for every missing link smell, there are two smelly developers. CODEFACE4SMELLS outputs a missing link instance as a pair of developers. So, the smelly developers can be obtained by extracting all missing link instances of a project. The smelly developers of a project $x$ can be denoted by a set $SD_x$. The number of smelly developers of the project is the number of elements in $SD_x$.</p><p>To calculate the percentage of smelly developers in a project, the total number of developers of that project is required. The total number of developers is defined as the sum of the number of developers who contribute to source code and the number of members who communicate on the mailing list <ref type="bibr" target="#b2">[3]</ref>. The total number of developers of a project is obtained by counting the number of members present in either the collaboration or the communication network generated by CODEFACE4SMELLS. The percentage of smelly developers of a project is calculated using the following formula (Equation <ref type="formula" target="#formula_0">1</ref>),</p><formula xml:id="formula_0">percSD_x = \frac{numSD_x}{totalDev_x} \times 100\%,<label>(1)</label></formula><p>where $numSD_x$ is the number of smelly developers in project $x$ and $totalDev_x$ is the total number of developers in project $x$.</p></div>
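<div xmlns="http://www.tei-c.org/ns/1.0"><p>As a minimal sketch of this step, the set of smelly developers can be collected from the missing link pairs and Equation 1 applied to the counts. The helper names and the example pairs are hypothetical; the Cassandra figures are the ones reported in Table 3.</p>
<p>
```python
def smelly_developers(missing_links):
    """Collect every developer that appears in at least one missing link pair."""
    developers = set()
    for a, b in missing_links:
        developers.update((a, b))
    return developers


def percent_smelly(num_smelly, total_developers):
    """Equation 1: share of smelly developers in a project, in percent."""
    if total_developers == 0:
        raise ValueError("total_developers must be positive")
    return num_smelly / total_developers * 100


# Hypothetical missing link instances (illustrative data only).
pairs = [("a", "b"), ("a", "c"), ("d", "e")]
print(sorted(smelly_developers(pairs)))  # ['a', 'b', 'c', 'd', 'e']

# Apache Cassandra as reported in Table 3: 205 smelly out of 1380 developers.
print(round(percent_smelly(205, 1380), 1))  # 14.9
```
</p></div>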
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.4.">Correlation Analysis</head><p>RQ2 aims to understand the relationship between a developer's contribution and involvement in missing link smell. To address this RQ, the correlation between the following two measures is analysed: (1) how many commits a developer has in the project repository, and (2) how many times a developer is involved in missing link smell.</p><p>In open-source projects, commits are the most representative form of coding contribution <ref type="bibr" target="#b8">[9]</ref>. So, the contribution of a developer to a project is measured by the number of commits of that developer in the project repository.</p><p>The number of commits of every individual developer is retrieved from the source code repository.</p><p>The number of involvements in missing link smells can be obtained from the list of missing link instances of a project. First, the developers are extracted from all the missing link instances of the project. Then the number of involvements is calculated by counting how many times a developer occurs in the list.</p><p>Both the number of commits and the number of involvements in smells of a developer are converted into percentages to obtain relative measurements. The commit percentage of a developer is calculated using Equation <ref type="formula" target="#formula_1">2</ref>.</p><formula xml:id="formula_1">percentCommit = \frac{numCommit_i}{\sum_{i=1}^{n} numCommit_i} \times 100\%<label>(2)</label></formula><p>where $numCommit_i$ is the number of commits of developer $i$ and $n$ is the total number of smelly developers.</p><p>Equation 3 is used to calculate the missing link smell involvement of a developer as a percentage.</p><formula xml:id="formula_2">percentMissingLink = \frac{numMissingLink_i}{\sum_{i=1}^{n} numMissingLink_i} \times 100\%<label>(3)</label></formula><p>where $numMissingLink_i$ is the number of involvements in missing link smells of developer $i$ and $n$ is the total number of smelly developers.</p><p>Finally, the correlation analysis is performed between $percentCommit$ and $percentMissingLink$ for each project individually. Kendall's tau-b <ref type="bibr" target="#b9">[10]</ref> is used to assess the degree of association between these two variables. Both $percentCommit$ and $percentMissingLink$ have tied values in the dataset, and Kendall's tau-b is chosen because it can handle tied ranks. The correlation coefficient is considered significant if the p-value is less than 0.01. The correlation coefficient is interpreted according to Table <ref type="table">2</ref>. The correlation coefficient, $\tau_b$, indicates the strength of the correlation and ranges from -1.0 to 1.0. As $\tau_b$ approaches 0, the correlation between the two variables weakens. As $\tau_b$ approaches -1.0 or +1.0, the correlation between the two variables strengthens. A positive value of $\tau_b$ indicates a positive correlation and a negative value of $\tau_b$ indicates a negative correlation between the two variables.</p></div>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table"><head>Table 2</head><label>2</label><figDesc>Correlation coefficient interpretation</figDesc><table><row><cell>Correlation Coefficient (Negative)</cell><cell>Correlation Coefficient (Positive)</cell><cell>Interpretation</cell></row><row><cell>-0.4 &lt; τ_b ≤ 0.0</cell><cell>0.0 ≤ τ_b &lt; 0.4</cell><cell>Weak</cell></row><row><cell>-0.7 &lt; τ_b ≤ -0.4</cell><cell>0.4 ≤ τ_b &lt; 0.7</cell><cell>Moderate</cell></row><row><cell>-0.9 &lt; τ_b ≤ -0.7</cell><cell>0.7 ≤ τ_b &lt; 0.9</cell><cell>Strong</cell></row><row><cell>-1.0 ≤ τ_b ≤ -0.9</cell><cell>0.9 ≤ τ_b ≤ 1.0</cell><cell>Very Strong</cell></row></table></figure>
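<div xmlns="http://www.tei-c.org/ns/1.0"><p>The correlation step can be illustrated with a small pure-Python sketch of Kendall's tau-b together with the interpretation bands of Table 2. It is an assumption-laden illustration, not the analysis script used in the study: the function names and the input counts are hypothetical, the pairwise loop is quadratic in the number of developers, and no p-value is computed.</p>
<p>
```python
from itertools import combinations
from math import sqrt


def to_percent(counts):
    """Equations 2 and 3: convert raw counts into percentages of their total."""
    total = sum(counts)
    return [count / total * 100 for count in counts]


def kendall_tau_b(x, y):
    """Kendall's tau-b, which corrects the denominator for tied pairs."""
    concordant = discordant = ties_x = ties_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied in both variables: excluded from both denominator terms
        if dx == 0:
            ties_x += 1
        elif dy == 0:
            ties_y += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    denominator = sqrt((untied + ties_x) * (untied + ties_y))
    if denominator == 0:
        raise ValueError("tau-b is undefined when one variable is constant")
    return (concordant - discordant) / denominator


def interpret(tau_b):
    """Map a coefficient to the strength labels of Table 2."""
    strength = abs(tau_b)
    if strength >= 0.9:
        return "Very Strong"
    if strength >= 0.7:
        return "Strong"
    if strength >= 0.4:
        return "Moderate"
    return "Weak"


# Hypothetical per-developer counts (illustrative data only).
commits = [50, 30, 10, 10]
involvements = [5, 3, 1, 2]

tau = kendall_tau_b(to_percent(commits), to_percent(involvements))
print(round(tau, 3))     # 0.913
print(interpret(0.612))  # Moderate
```
</p><p>Because the percentage conversion divides every count by the same positive total, it preserves the ordering of developers, so the coefficient is the same whether raw counts or percentages are passed in.</p></div>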
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Result Analysis</head><p>This section presents the result analysis and discussion of this study. All the missing link smells found in selected projects are analysed to answer the two research questions. Analysis and discussion for both research questions are provided as follows.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.1.">RQ1: How many developers are involved in missing link community smell?</head><p>To answer this RQ, all missing link smells of a project are considered. For every project, the total number of developers and the number of smelly developers are calculated.</p><p>Then the percentage of smelly developers is obtained for each project. Table <ref type="table" target="#tab_2">3</ref> shows the percentage of smelly developers for each project. For example, the Apache Cassandra project has 1380 total developers and 205 smelly developers, which is 14.9% of the total developers. It is observed that on average 10.5% of the total developers of a software community are involved in missing link smells. The Apache Cayenne community has the highest percentage of smelly developers (21.1%). It is also the smallest of the seven communities. Tamburri et al. found that the number of community smells grows quadratically with the number of community members until a threshold of 200 community members <ref type="bibr" target="#b3">[4]</ref>. The occurrences of community smells tend to stabilize after this threshold. As the total number of developers in the Apache Cayenne community is less than 200, the number of missing link smells has not stabilized yet. So, this project has relatively more missing link smells and consequently more smelly developers. Excluding the Apache Cayenne project, the remaining six projects have 8.7% smelly developers on average.</p><p>These results suggest that only a small portion of developers in an open-source software community are involved in missing link smells. They do not communicate appropriately with their co-committing or collaborating developers. Thus, they contribute to the total number of community smells in a software community.</p></div>
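<div xmlns="http://www.tei-c.org/ns/1.0"><p>The two averages quoted above can be re-derived from the per-project percentages listed in Table 3. The snippet below is only an arithmetic check on those reported values; the variable names are illustrative.</p>
<p>
```python
from statistics import mean

# Percentage of smelly developers per project, as listed in Table 3.
percent_by_project = {
    "Apache Cassandra": 14.9,
    "Apache CXF": 9.7,
    "Apache Jena": 13.9,
    "Apache Mahout": 4.6,
    "Apache Pig": 6.0,
    "Apache Jackrabbit": 3.0,
    "Apache Cayenne": 21.1,
}

average_all = mean(percent_by_project.values())
average_without_cayenne = mean(
    value for name, value in percent_by_project.items() if name != "Apache Cayenne"
)

print(round(average_all, 1))              # 10.5
print(round(average_without_cayenne, 1))  # 8.7
```
</p></div>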
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.2.">RQ2: How does missing link smell relate to a developer's contribution?</head><p>To answer this RQ, the correlation between a developer's contribution and involvement in missing link smell is analyzed. Kendall's tau-b is used as the correlation technique since it can handle tied values.</p><p>First, the correlation analysis is performed individually for each development community. The Kendall's tau-b coefficients and p-values are provided in Table 4.</p><p>Another correlation analysis is performed after combining the data from all the projects. The value of the correlation coefficient increases slightly to 0.612 but still falls within the range of moderate positive correlation. This result is also statistically significant, with a p-value less than 0.01.</p><p>These results suggest that a developer who contributes more to a project tends to be involved in more missing link smells. This can happen because a developer who contributes more has to communicate more with other developers. The resulting communication overload may be the reason for being involved in more missing link smells than others. From another point of view, a developer who contributes more to a project is likely to be more familiar and experienced with that project. Knowing most aspects of the project, such a developer may take communication with co-committers lightly while contributing. However, further analysis is required to find the causes of involvement in more smells.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">Threats to Validity</head><p>This section discusses the potential threats that may affect the validity of this study.</p><p>Threats to external validity: Threats to external validity concern the generalization of the obtained results. In this study, seven projects from Apache are analysed. Thus, generalisation requires more projects belonging to different ecosystems. However, to mitigate this threat, large and diverse projects with a long change history (11 years on average) are selected.</p><p>Threats to internal validity: Threats to internal validity concern the factors that can influence the results but are not accounted for. In this study, the CODEFACE4SMELLS tool is used for the detection of missing link smell. The outputs of CODEFACE4SMELLS are directly incorporated in this study without checking whether there is any defect in the tool. However, the capability of this tool to identify missing link smell was evaluated in <ref type="bibr" target="#b2">[3]</ref>. This tool is also used in other studies for detecting community smells <ref type="bibr" target="#b1">[2]</ref>, <ref type="bibr" target="#b4">[5]</ref>, <ref type="bibr" target="#b10">[11]</ref>.</p><p>Moreover, this tool relies on the mailing list to detect communication among developers. But there may exist other communication channels, e.g., Skype or Facebook, where developers communicate with each other. The results could change if these communication sources were considered. However, the mailing list represents the main communication channel for the projects analysed in this study according to the contribution guidelines of these projects. Besides, the mailing list is used as the communication source in other related studies <ref type="bibr" target="#b3">[4]</ref>, <ref type="bibr" target="#b6">[7]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7.">Conclusion</head><p>This study explores the percentage of developers in a software development community involved in missing link smells. Furthermore, the relationship between developer contribution and involvement in missing link smell is examined. At first, missing link smells are detected for all the projects. Next, the smelly developers are identified by extracting missing link instances. The percentage of smelly developers is calculated for every project. The number of appearances of a developer in missing link smells is counted. The contribution of a developer to a project is measured by the number of commits. Finally, correlation analysis is done between contribution and involvement in smell.</p><p>This study analyses seven open-source projects of Apache. The results show that the percentage of developers involved in missing link smells is 8.7% on average. This study also finds that there is a moderate positive correlation between the number of commits of a developer and the number of involvements in missing link smells. The developers who contribute more tend to be involved in more missing link smells.</p><p>In future, projects from other ecosystems can be analysed to assess the generalization of the results. Besides, other types of community smell, e.g., organizational silo and radio silence, can be examined to find their association with developers' contribution.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1:</head><label>1</label><figDesc>Figure 1: Developer Social Network</figDesc><graphic coords="2,312.79,84.19,183.03,102.82" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 2 and Figure 3:</head><label>2, 3</label><figDesc>Figure 2: Collaboration Network. Figure 3: Communication Network.</figDesc><graphic coords="3,129.96,84.19,122.02,92.77" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1</head><label>1</label><figDesc>List of Analysed Projects</figDesc><table><row><cell>#</cell><cell>Project Name</cell><cell>Source Code</cell><cell>Mailing List</cell><cell>Analysis Period</cell></row><row><cell>1</cell><cell>Apache Cassandra</cell><cell>github.com/apache/cassandra</cell><cell>gmane.comp.db.cassandra.devel</cell><cell>Oct-2009 - Sep-2020</cell></row><row><cell>2</cell><cell>Apache Cayenne</cell><cell>github.com/apache/cayenne</cell><cell>gmane.comp.java.cayenne.devel</cell><cell>Nov-2007 - Aug-2020</cell></row><row><cell>3</cell><cell>Apache CXF</cell><cell>github.com/apache/cxf</cell><cell>gmane.comp.apache.cxf.devel</cell><cell>Nov-2010 - Sep-2020</cell></row><row><cell>4</cell><cell>Apache Jackrabbit</cell><cell>github.com/apache/jackrabbit</cell><cell>gmane.comp.apache.jackrabbit.devel</cell><cell>Dec-2005 - Sep-2020</cell></row><row><cell>5</cell><cell>Apache Jena</cell><cell>github.com/apache/jena</cell><cell>gmane.comp.apache.jena.devel</cell><cell>Oct-2012 - Sep-2020</cell></row><row><cell>6</cell><cell>Apache Mahout</cell><cell>github.com/apache/mahout</cell><cell>gmane.comp.apache.mahout.devel</cell><cell>Oct-2008 - Aug-2020</cell></row><row><cell>7</cell><cell>Apache Pig</cell><cell>github.com/apache/pig</cell><cell>gmane.comp.java.hadoop.pig.devel</cell><cell>Oct-2010 - Aug-2020</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 3</head><label>3</label><figDesc>Percentage of Smelly Developers</figDesc><table><row><cell># Project Name</cell><cell>Total Developers</cell><cell>Smelly Developers</cell><cell>Smelly Developers (%)</cell><cell>Average</cell></row><row><cell>1 Apache Cassandra</cell><cell>1380</cell><cell>205</cell><cell>14.9%</cell><cell rows="7">8.7%</cell></row><row><cell>2 Apache CXF</cell><cell>972</cell><cell>94</cell><cell>9.7%</cell></row><row><cell>3 Apache Jena</cell><cell>244</cell><cell>34</cell><cell>13.9%</cell></row><row><cell>4 Apache Mahout</cell><cell>615</cell><cell>28</cell><cell>4.6%</cell></row><row><cell>5 Apache Pig</cell><cell>668</cell><cell>22</cell><cell>6.0%</cell></row><row><cell>6 Apache Jackrabbit</cell><cell>927</cell><cell>28</cell><cell>3.0%</cell></row><row><cell>7 Apache Cayenne</cell><cell>175</cell><cell>37</cell><cell>21.1%</cell></row><row><cell>Average</cell><cell>668</cell><cell>64</cell><cell>10.5%</cell><cell></cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 4</head><label>4</label><figDesc>Correlation Analysis</figDesc><table><row><cell># Project Name</cell><cell>Tau-b</cell><cell>p-value</cell></row><row><cell>1 Apache Cassandra</cell><cell>0.508</cell><cell>&lt; 0.01</cell></row><row><cell>2 Apache Cayenne</cell><cell>0.543</cell><cell>&lt; 0.01</cell></row><row><cell>3 Apache CXF</cell><cell>0.528</cell><cell>&lt; 0.01</cell></row><row><cell>4 Apache Jackrabbit</cell><cell>0.589</cell><cell>&lt; 0.01</cell></row><row><cell>5 Apache Jena</cell><cell>0.452</cell><cell>&lt; 0.01</cell></row><row><cell>6 Apache Mahout</cell><cell>0.409</cell><cell>&lt; 0.01</cell></row><row><cell>7 Apache Pig</cell><cell>0.513</cell><cell>&lt; 0.01</cell></row><row><cell>Overall</cell><cell>0.612</cell><cell>&lt; 0.01</cell></row></table><note>Kendall's tau-b is used as a correlation technique since it can handle tied values. First, the correlation analysis is performed individually for each development community. The Kendall's tau-b coefficients and p-values are provided in Table 4. For example, the correlation coefficient for Apache Cassandra project is 0.508 and it represents a moderate positive correlation. The value of correlation coefficient is significant with a p-value less than 0.01. All seven projects of this study show a moderate positive correlation between number of commits and number of smells which is statistically significant with p&lt;0.01.</note></figure>
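<div xmlns="http://www.tei-c.org/ns/1.0"><p>To make the tie handling behind the coefficients in Table 4 concrete, the following sketch computes Kendall's tau-b directly from its pairwise definition. It is a minimal illustration under stated assumptions, not the implementation used in this study. The function name and the sample commit and smell counts are hypothetical.</p>
<p>
```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b for two equal-length numeric sequences.

    tau_b = (C - D) / sqrt((C + D + Tx) * (C + D + Ty)), where C and D are
    the concordant and discordant pair counts, Tx counts pairs tied only in
    x, and Ty counts pairs tied only in y.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    concordant = discordant = ties_x_only = ties_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied in both variables: enters neither denominator term
        if dx == 0:
            ties_x_only += 1
        elif dy == 0:
            ties_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    denominator = sqrt((untied + ties_x_only) * (untied + ties_y_only))
    if denominator == 0:
        return float("nan")  # undefined when one variable is constant
    return (concordant - discordant) / denominator


# Hypothetical per-developer data (not taken from the study):
# number of commits and number of missing link smells per developer.
commits = [1, 1, 5, 12, 40]
smells = [0, 0, 1, 1, 3]
print(round(kendall_tau_b(commits, smells), 3))
```
</p><p>Pairs of developers with identical commit counts or identical smell counts shrink the denominator instead of being counted as agreement or disagreement, which matters here because many developers share the same low counts. The pairwise loop is quadratic in the number of developers, so a library routine would be preferable for large communities.</p></div>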
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">https://github.com/maelstromdat/CodeFace4Smells</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_1">http://gmane.io</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgments</head><p>The virtual machine facility used in this research is provided by Bangladesh Research and Education Network (BdREN).</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Social debt in software engineering: insights from industry</title>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">A</forename><surname>Tamburri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Kruchten</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Lago</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Van Vliet</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Internet Services and Applications</title>
		<imprint>
			<biblScope unit="volume">6</biblScope>
			<biblScope unit="page">10</biblScope>
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Beyond technical aspects: How do community smells influence the intensity of code smells?</title>
		<author>
			<persName><forename type="first">F</forename><surname>Palomba</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">A</forename><surname>Tamburri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><forename type="middle">A</forename><surname>Fontana</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Oliveto</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Zaidman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Serebrenik</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Software Engineering</title>
		<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<monogr>
		<author>
			<persName><forename type="first">S</forename><surname>Magnoni</surname></persName>
		</author>
		<title level="m">An approach to measure community smells in software development communities</title>
				<imprint>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Exploring community smells in open-source: An automated approach</title>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">A</forename><surname>Tamburri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Palomba</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Kazman</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Software Engineering</title>
		<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Gender diversity and women in software teams: How do they affect community smells?</title>
		<author>
			<persName><forename type="first">G</forename><surname>Catolino</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Palomba</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">A</forename><surname>Tamburri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Serebrenik</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Ferrucci</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE/ACM 41st International Conference on Software Engineering: Software Engineering in Society</title>
				<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="11" to="20" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">What is social debt in software engineering?</title>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">A</forename><surname>Tamburri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Kruchten</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Lago</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Van Vliet</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">2013 6th International Workshop on Cooperative and Human Aspects of Software Engineering</title>
				<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="2013">2013</date>
			<biblScope unit="page" from="93" to="96" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">From developer networks to verified communities: a fine-grained approach</title>
		<author>
			<persName><forename type="first">M</forename><surname>Joblin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Mauerer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Apel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Siegmund</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Riehle</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE/ACM 37th IEEE International Conference on Software Engineering</title>
				<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="2015">2015</date>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="563" to="573" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Poster: How do community smells influence code smells?</title>
		<author>
			<persName><forename type="first">F</forename><surname>Palomba</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">A</forename><surname>Tamburri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Serebrenik</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Zaidman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><forename type="middle">A</forename><surname>Fontana</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Oliveto</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE/ACM 40th International Conference on Software Engineering: Companion</title>
				<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="240" to="241" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">The effects of diversity in global, distributed collectives: A study of open source project success</title>
		<author>
			<persName><forename type="first">S</forename><surname>Daniel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Agarwal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">J</forename><surname>Stewart</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Information Systems Research</title>
		<imprint>
			<biblScope unit="volume">24</biblScope>
			<biblScope unit="page" from="312" to="333" />
			<date type="published" when="2013">2013</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<monogr>
		<title level="m" type="main">Rank correlation methods</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">G</forename><surname>Kendall</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1948">1948</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<monogr>
		<author>
			<persName><forename type="first">F</forename><surname>Giarola</surname></persName>
		</author>
		<title level="m">Detecting code and community smells in open-source: an automated approach</title>
				<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<monogr>
		<title level="m">International Workshop on Quantitative Approaches to Software Quality</title>
				<meeting><address><addrLine>QuASoQ</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
