4th International Workshop on Quantitative Approaches to Software Quality (QuASoQ 2016)

Local Variables with Compound Names and Comments as Signs of Fault-Prone Java Methods

Hirohisa Aman, Sousuke Amasaki, Minoru Kawahara, Tomoyuki Yokogawa

Center for Information Technology, Ehime University, Matsuyama, Ehime 790-8577, Japan (Email: aman@ehime-u.ac.jp)
Faculty of Computer Science and Systems Engineering, Okayama Prefectural University, Soja, Okayama 719–1197, Japan
Center for Information Technology, Ehime University, Matsuyama, Ehime 790-8577, Japan

Abstract: This paper focuses on two types of artifacts: local variables and comments in a method (function). Both of them are usually used at the programmer's discretion. Thus, naming local variables and commenting code can vary among individuals, and such an individual difference may cause a dispersion in quality. This paper conducts an empirical analysis on the fault-proneness of Java methods which are collected from nine popular open source products. The results report the following three findings: (1) Methods having local variables with compound names are more likely to be faulty than the others; (2) Methods having local variables with simple and short names are unlikely to be faulty, but their positive effects tend to decay as their scopes get wider; (3) The presence of comments within a method body can also be a useful sign of a fault-prone method.

I. INTRODUCTION

Software systems have been utilized in many aspects of our daily life, and management of software quality has been the most significant activity for ensuring the safety and security of the people. In fact, it is hard to always make a one-shot release of a perfect software product which has no need to be enhanced or modified in the future; software systems usually require upgrades after their releases in order to fix their faults and/or to enrich their functionality. Needless to say, it is better to reduce both the frequency of their upgrades and the size of their patches to be applied.

To minimize upgrades of software products, thorough review and testing before their releases are desirable activities. In general, software review and testing help to detect concealed faults or identify suspicious software modules which are fault-prone [1], [2]. Then, those problems can be resolved by fixing faults or refactoring problematic programs in order to reduce the risk of causing unwanted upgrades after their releases.

While review and testing are useful activities, they are also costly ones, so there have been many studies using software metrics to predict fault-prone modules prior to software review and testing activities [3]. By predicting fault-prone parts of a software product, cost-effective review and testing would be performed, i.e., we would detect more faults at less cost.

Most studied methods and models for predicting fault-prone modules have been based on structural features of products such as their sizes and complexities, or on development histories stored in their code repositories such as the number of bug-fix commits which have been made by a certain point in time [4], [5], [6]. However, the impact of human factors would also be significant since programming activities are usually done by human beings. Different programmers would probably develop different programs for the same specification. Such a difference among individuals must have a certain level of influence on the quality of products, i.e., it must cause a dispersion in quality.

Therefore, we focus on the following two artifacts which may vary from person to person: (1) local variables declared in a method (function) and (2) comments written inside the method body. While these artifacts have no impact on the structure of a program, they seem to be related to the understandability and the readability of the program, so they can be expected to play important roles in predicting fault-prone methods. In this paper, we quantitatively analyze the relationships of these artifacts with the fault-proneness.

The key contribution of this paper is to provide the following findings derived from the results of our empirical analysis with nine popular open source software (OSS) products:
• Local variables with descriptive compound names (for example, "countOfSatisfactoryRecords") can be signs that the methods are fault-prone.
• Methods having local variables with simple and short names (for example, "c" or "cnt") are unlikely to be faulty, but their positive effects tend to decay as their scopes get wider.
• Comments within a method body also seem to be related to the fault-proneness of the method.

The remainder of this paper is organized as follows. Section II describes two types of artifacts which may vary among programmers, (1) names of local variables and (2) comments written inside a method body, and their relationships with the quality of source programs, and gives our research questions in regard to impacts of those artifacts. Section III reports on an empirical analysis on our research questions using popular OSS products, and discusses the results. Section IV briefly describes related work. Finally, Section V presents the conclusion of this paper and our future work.

II. LOCAL VARIABLES AND COMMENTS

This paper focuses on local variables and comments, since they may vary widely from person to person and cause a variation in quality. This section describes concerns of local variable names and comments in regard to source code quality, and sets up our research questions.

A. Local Variable Name

Since local variables are valid only within a function or a method, names of local variables are usually not specified in their software specifications or design documents. Therefore, naming local variables can be at the programmer's discretion. In general, different programmers would prefer different names for local variables even if they implement the same algorithm in their function or method. For example, a programmer likes to use "count" as the name of a local variable for storing the number of records which satisfy a certain condition, but another programmer prefers "c" as its name; there might even be a programmer who wants to give "countOfSatisfactoryRecords" to the variable.

Needless to say, local variables with fully-spelled names such as "count" or ones with descriptive compound names such as "countOfSatisfactoryRecords" make it easy to understand the roles of those variables in their function or method since those names provide more information about those variables than shorter and/or simpler names. Lawrie et al. [7] surveyed the understandability of identifiers (including not only local variables' names but also functions' names) used in programs by comparing three types of names, (1) fully-spelled names such as "count," (2) abbreviated names such as "cnt" and (3) names using only an initial letter such as "c." They reported that a longer name is easier to understand for programmers, but there is not a significant difference in comprehensibility between fully-spelled names and abbreviated ones in their survey results. That is to say, it is not always necessary to give a long and descriptive name to a local variable, and a short and simple name may be sufficient.

There are also programming heuristics on naming local variables. Both the GNU coding standards [8] and the Java coding convention [9] have said that names of local variables should be shorter. Moreover, Kernighan and Pike [10] also argued that shorter names are sufficient for local variables; for example, they considered that the name "n" looks good for a local variable storing "the number of points" while the name "numberOfPoints" seems to be overdone. Thus, long and descriptive names have not been recommended for the names of local variables. However, the impact of such a descriptive name on the code quality has not been clearly discussed in those heuristics.

Aman et al. [11] conducted an empirical work and showed that methods having local variables with long names are more likely to be fault-prone and change-prone than the other methods. That is to say, they showed a relationship between a long name of a local variable and a poor quality of the code in a statistical manner. However, their analysis missed taking into account the following two aspects: (1) the composition of the variable's name and (2) the scope of the local variable. Focusing not only on the length of a local variable's name but also on those two aspects would be more worthwhile in analyzing the impact of a local variable's name and in enhancing the quality of code. This is a key motivation of this work.

B. Comments

Comments are documents embedded in a source file, which usually provide beneficial information in regard to the program [12]. While there are several types of comments, we focus on comments written inside a method (function) body in this paper. Those comments usually give explanations or programmer's memos for their implementation in the method. Of course, the other types of comments also provide important information regarding the program. However, such comments written outside a method body are often the copyright designation or the programmer's manual explaining how to use the method, i.e., those comments may not be decided at the discretion of the programmer. Thus, those comments outside a method body may be out of our research scope focusing on the individual difference among programmers. That is the reason why we will focus only on the comments written inside a method body.

While comments along with executable code can be a great help in understanding the code, there have also been criticisms on their effects: comments might be written to compensate for a lack of readability in complicated programs [13]. In this context, Fowler [14] pointed out that well-written comments may be "deodorant" for masking "code smells." Although comments themselves are good artifacts, they may be used for neutralizing "bad-smelling" code. Kernighan and Pike [10] said that programmers should not add detailed comments to bad code; in such a case, it is better to rewrite the code rather than adding comments. If a programmer wants to add detailed comments to their code during their programming activity, the programmer may consider that the program is hard to understand for others without those comments. That is to say, comments may be signs of complicated programs. Aman et al. [11], [15] reported supporting empirical results that commented programs tend to be more fault-prone than non-commented ones. In this paper, we conduct a further analysis examining combinations of (1) the composition of the local variable's name, (2) the local variable's scope and (3) comments, in terms of fault-proneness.

C. Research Questions

As mentioned above, both the local variables and the comments are not only artifacts which may vary among programmers, but also remarkable ones which are expected to have relationships with the quality of the code. However, the analyses in the previous work [11], [15] missed considerations for the composition of a local variable's name and the scope of a local variable. We will conduct a further analysis by focusing on those missed aspects as well. In order to clarify our points of view in our empirical analysis, we set up the following two research questions (RQs):

RQ1 Can local variables with compound names be signs of fault-prone methods?
RQ2 How does a local variable's scope relate to the effect of the local variable's name on the fault-proneness in a method?

We will check the above two questions while considering the impact of comments as well.

As mentioned in Section II-A, there have been concerns in giving descriptive names to local variables. Compound names such as "numberOfPoints" are typical descriptive names. RQ1 asks whether a local variable with such a compound name can be a sign for finding a fault-prone method or not.

If a local variable is declared with a narrow scope, it does not seem to need a descriptive name since its influence is limited within a narrow range. RQ2 focuses on the relationship of local variables' names with their scopes.

In examining these RQs, this paper expects to find yet another useful clue of fault-prone methods by focusing on their local variable names.

III. EMPIRICAL ANALYSIS

This section conducts an empirical analysis in which we collect quantitative data from popular OSS products and analyze that data in order to discuss the above research questions.

A. Aim and Dataset

The aim of this analysis is to quantitatively examine the fault-proneness of Java methods by focusing on the names of local variables, the scopes of them and the presence of comments. The results of this analysis are expected to present useful points to be checked during code review activities.

We collected data from nine popular OSS products of different size and domain, shown in Table I: (1) Angry IP Scanner (IP-Scanner)¹, (2) Eclipse Checkstyle Plug-in (Checkstyle)², (3) eXo Platform (eXo)³, (4) FreeMind⁴, (5) GNU ARM Eclipse Plug-ins (ARM)⁵, (6) Hibernate ORM (Hibernate)⁶, (7) PMD⁷, (8) ProjectLibre⁸ and (9) SQuirreL SQL Client (SQuirreL)⁹. All of them are ranked in the top 50 popular Java products at SourceForge.net¹⁰, and their source files have been maintained with Git. The restrictions of the development language and the version control system are from our data collection tools¹¹.

TABLE I. SURVEYED OSS PRODUCTS.

Product        Size (KLOC)   #Methods having a local variable   Data collection period      Domain
IP-Scanner     16            433                                2006-07-19 to 2016-04-04    Networking
Checkstyle     21            738                                2003-05-05 to 2016-03-28    Code analysis
eXo            21            675                                2007-03-17 to 2016-04-06    Social collaboration software
FreeMind       71            2,353                              2011-02-06 to 2016-03-30    Mind-mapping tool
ARM            282           1,300                              2013-09-11 to 2016-03-14    Development support
Hibernate      387           6,372                              2007-06-29 to 2016-03-31    Object/Relational mapping
ProjectLibre   224           1,466                              2012-08-22 to 2016-04-06    MS Project clone
PMD            75            738                                2002-06-21 to 2016-04-05    Source code analyzer
SQuirreL       405           6,060                              2001-06-01 to 2016-04-05    Database client
Total          1,502         20,135

B. Procedure of Data Collection

We collected data from each OSS project in the following procedure.

(1) Make a clone of the repository, and make the list of all methods included in the current version.
(2) Get the change history of each method:
We check the source lines which were changed by each commit on the repository, and decide which methods were modified at that time (see Fig. 1). The decision is made by the following three steps.

Fig. 1. Change histories of methods included in a source file.

(2a) Get both the older version and the newer version of the source file which was changed by the commit.
(2b) Compare those two consecutive versions, and find the parts that differ between them. Then, obtain the corresponding line numbers in the newer version.
(2c) Decide which method(s) were changed, by checking the line numbers of the changed lines against each method's position (range) in the newer version.
By iterating these steps for all commits, we get the change history of each method.
(3) Collect the data on representative local variables' names and scopes, and comments for each method:
We survey the names of local variables declared in the initial version of a method (see Fig. 1) and the scopes of those variables. We define the length of a local variable's scope to be the number of lines where the variable is valid except for the line of its declaration. For example, the length of scope of variable "len" shown in Fig. 2 is 5 and that of variable "str" is 2.

    String foo (String arg) {
        int len = arg.length();
        if (len < 5) {
            return new String(arg);
        }
        String str = arg.substring(0, 5);
        return str + "...";
    }

Fig. 2. An example of method having local variables.

When there are two or more variables in a method, we focus on the variable whose scope is widest in the method as the "representative local variable" in order to connect the features of the local variable to the method. In the example shown in Fig. 2, the "representative local variable" of method "foo" is variable "len." If there are two or more local variables with the widest scope in a method, we adopt the variable with the longer name (having more characters) as the representative variable. Needless to say, if there is only one local variable in a method, the variable is the representative local variable of the method. On the other hand, any methods having no local variable are excluded from the data of interest in this work.
We collect the lines of comments written inside a method body as well.
(4) Check if a bug fix has occurred for each method:
We examine the change history of each method obtained above and check whether a bug fix has occurred at any change of the method. We decide whether a code change was intended as a bug fix or not by checking its commit message [16]. For example, Fig. 3 shows a part of a commit message (obtained by using the git log command) on the repository of SQuirreL SQL Client, which seems to be a bug-fixing commit. Since method "_init" in "AliasEditController.java" was modified by the commit, we consider that a bug fix was performed at the method.

    commit 0d005dc6573dcc12df03917ee974a0736b4d5cfd
    .............
    Bug #1236 Shortcut for comment/uncomment current line (ctrl + "/") does not
    Fixed according to the suggestion in the bug #1236
    Please note: The orginal comment/uncomment hot key of SQuirreL is ctrl+Num

Fig. 3. An example of actual commit message.

C. Procedure of Data Analysis

We conducted our data analysis in the following procedure.

(1) Perform a random sampling of methods, which have a local variable, from all projects:
In order to avoid an impact of project size bias on our empirical results, we randomly sample the same number of methods from each project.
(2) Divide the set of methods into subsets according to the representative local variable's name:
We consider "a local variable with a short name" to be one such that the length of its name is less than or equal to the 25 percentile in the distribution of name length. We also take into account whether the name is a compound one or not for RQ1. Thus, we consider the following three categories.
• $V_1$: the set of methods such that the name of the representative local variable is short and not compound.
• $V_2$: the set of methods such that the name of the representative local variable is not short and not compound.
• $V_3$: the set of methods such that the name of the representative local variable is a compound one.
We decide that a variable has a compound name if it is composed in camel case such as "numberOfItems." That is to say, we consider a name to be a compound one if it has a lower case letter followed by an upper case letter. We regard such a pair of lower case letter and upper case letter as a splitting position of the name. For example, there are two splitting positions in "numberOfItems," i.e., the pair of "r" and "O," and the pair of "f" and "I," so the name can be split into three portions (words) "number," "Of" and "Items." We consider that such compound names cannot be short ones composed of at most a few characters. Thus, we do not divide the set of methods having representative local variables with compound names, and define $V_3$ only (not $V_3$ and $V_4$).
(3) Divide the subsets of methods obtained at Step (2) into two, according to the presence of comments:
In order to analyze the impact of comments as well, we divide the set of methods into two subsets by checking whether there are comments¹² inside method bodies or not.
• $C_0$: the set of methods having no comment.
• $C_1$: the set of methods having comments.
Then, we define $M_{ij} = C_i \cap V_j$ for $i = 0, 1$ and $j = 1, 2, 3$. For example, $M_{01}$ is the set of non-commented methods in which the representative local variable has a short and non-compound name. Table II summarizes these categories (the method sets) $M_{ij}$.

¹ http://angryip.org/
² http://eclipse-cs.sourceforge.net/
³ http://exoplatform.com/
⁴ http://freemind.sourceforge.net/wiki/index.php/Main_Page
⁵ http://gnuarmeclipse.livius.net/blog/
⁶ http://hibernate.org/
⁷ https://pmd.github.io/
⁸ http://www.projectlibre.org/
⁹ http://www.squirrelsql.org/
¹⁰ http://sourceforge.net/
¹¹ http://se.cite.ehime-u.ac.jp/tool/
¹² We excluded commented-out code from our data by using a checking algorithm [17].
TABLE II. SYMBOLS REPRESENTING CATEGORIES.

                          Name of representative local variable
                          Non-compound                  Compound
                          short         not short
Non-commented methods     $M_{01}$      $M_{02}$        $M_{03}$
Commented methods         $M_{11}$      $M_{12}$        $M_{13}$

(4) Examine the fault-proneness of methods by the above categories $M_{ij}$:
We statistically compare the bug fix rates among categories $M_{ij}$ (for $i = 0, 1$ and $j = 1, 2, 3$) and discuss the results.
(5) Examine the trends of the bug fix rates over scope:
In order to analyze the impact of the variable's scope as well, we analyze the changes in bug fix rate by varying the range (the length of scope) which we focus on. Concretely, we compare the moving averages of the bug fix rates among categories $M_{ij}$ (for $i = 0, 1$ and $j = 1, 2, 3$), by varying the range of scope in focus.

D. Results and Discussion: Collected Data

We first show the results of our data collection. Since the minimum number of methods included in a project was 433 as shown in Table I (project "IP-Scanner"), we randomly sampled 400 methods from each project, so our dataset consists of 3,600 methods in total.

Table III shows the distributions of the length of representative local variables' names in character count and in word count, respectively. Here, "word count" means the number of words composing a variable's name which is split according to the notion of camel case. The longest names in character count were "containsSuppressWarningsHolderModule" and "organizationInitializersHomePathNode" which consist of 36 characters, and the longest name in word count was "thereWereNodesToBeFolded" which consists of 6 words. Although some such long and descriptive names appear in some methods, most local variables have names that consist of at most a few characters and they are non-compound names whose word count is one. Since the 25 percentile ($Q_1$) of the character count is four as shown in Table III, we will consider a name whose length is less than or equal to four letters to be short in the following analysis.

TABLE III. DISTRIBUTION OF LENGTH OF REPRESENTATIVE LOCAL VARIABLE NAMES.

Unit        Min.   Q1   Median   Q3   Max.
Character   1      4    6        10   36
Word        1      1    1        2    6
(Q1: 25 percentile; Q3: 75 percentile)

Table IV presents the distribution of the length of a representative local variable's scope. There were some methods such as the one shown in Fig. 4, so the minimum length of the scope is zero. As all local variables are valid only within a (part of the) method, the majority of them are around a few to ten lines of code. In order to filter out extreme data which may be noise in our analysis, we will use only the data whose scopes are in between the 25 percentile ($Q_1 = 4$) and the 75 percentile ($Q_3 = 19$) of their distribution. By this data filtering, the number of our samples is reduced to 1,872. Table V gives the number of methods belonging to each category $M_{ij}$ (for $i = 0, 1$; $j = 1, 2, 3$) after this filtering.

TABLE IV. DISTRIBUTION OF SCOPE OF REPRESENTATIVE LOCAL VARIABLES.

Min.   Q1   Median   Q3   Max.
0      4    9        19   793
(Q1: 25 percentile; Q3: 75 percentile)

    private void doConnectToRunningChanged() {
        if (doStartGdbServer.getSelection()) {
            boolean enabled = doConnectToRunning.getSelection();
        }
    }

Fig. 4. An instance of local variable whose scope is zero ("enabled").

TABLE V. NUMBER OF METHODS BELONGING TO EACH CATEGORY.

                 Non-compound name                  Compound name     Total
Category         ≤4               >4
Non-commented    401 ($M_{01}$)   527 ($M_{02}$)    427 ($M_{03}$)    1,355 ($C_0$)
Commented        139 ($M_{11}$)   164 ($M_{12}$)    214 ($M_{13}$)    517 ($C_1$)
Total            540 ($V_1$)      691 ($V_2$)       641 ($V_3$)       1,872

Table VI shows the distribution of the number of bug fixes which had occurred in methods over their changes. About 18% of methods seemed to have had a hidden fault and to have been fixed through their code changes. Since we already filtered out the methods such that the scope of the representative local variable was wide, most of the methods in our dataset were small-sized and thus possibly simpler in structure. Hence, conventional size metrics and structural complexity metrics would be ineffective for analyzing the fault-proneness of methods in detail. It would be worthwhile to focus on a feature of methods other than the size and complexity. A local variable name might be yet another useful feature to be focused on.

TABLE VI. DISTRIBUTION OF NUMBER OF BUG FIXES OBSERVED IN METHODS AND BUG FIX RATE.

Min.   Q1   Median   Q3   Max.   Rate
0      0    0        0    5      18.1%
(Q1: 25 percentile; Q3: 75 percentile)

E. Results and Discussion: Comparison of Bug Fix Rates by Category

Table VII presents the bug fix rate in each category $M_{ij}$ (for $i = 0, 1$; $j = 1, 2, 3$). There seem to be differences in the bug fix rates among categories. The minimum bug fix rate is 0.135 in $M_{01}$ and the maximum bug fix rate is 0.252 in $M_{13}$, so the latter rate is about twice as large as the former one. We did a $\chi^2$ test for the differences of bug fix rates in the results. The test confirmed that there are statistically significant differences among the bug fix rates in the categories at the 1% level of significance ($p = 0.0053$; $\chi^2 = 16.6$; degrees of freedom = 5). That is to say, the above categorization of methods by focusing on the name of local variables and comments is meaningful for discussing the differences of fault-proneness in the methods.

TABLE VII. BUG FIX RATES BY CATEGORY.

                 Non-compound name                                Compound name
Category         ≤4                       >4
Non-commented    $M_{01}$ 0.135 (54/401)  $M_{02}$ 0.165 (87/527)  $M_{03}$ 0.211 (90/427)
Commented        $M_{11}$ 0.180 (25/139)  $M_{12}$ 0.177 (29/164)  $M_{13}$ 0.252 (54/214)

Fig. 5. Comparison of bug fix rates by category. (a) Non-commented (b) Commented

In the categories of non-commented methods $M_{0j}$ (for $j = 1, 2, 3$), we can observe an increasing trend in the bug fix rate (BFR): $\mathrm{BFR}(M_{01}) = 0.135 < \mathrm{BFR}(M_{02}) = 0.165 < \mathrm{BFR}(M_{03}) = 0.211$ (see Table VII and Fig. 5(a)). We also identified that the increasing tendency is statistically significant through the Cochran-Armitage test [18] at the 1% level of significance ($p = 0.0035$; $\chi^2 = 8.52$; degrees of freedom = 1). From this trend, we can say that methods having representative local variables with shorter names are likely to be better in terms of fault-proneness, and the ones with compound names are worse than the others.

On the other hand, in the categories of commented methods $M_{1j}$ (for $j = 1, 2, 3$), we cannot identify an increasing trend in the bug fix rate; it seems that $\mathrm{BFR}(M_{11}) = 0.180 \simeq \mathrm{BFR}(M_{12}) = 0.177 < \mathrm{BFR}(M_{13}) = 0.252$ (see Table VII and Fig. 5(b)).

For all three categories, their bug fix rates were higher than the ones of non-commented methods, i.e., $\mathrm{BFR}(M_{0j}) < \mathrm{BFR}(M_{1j})$ (for $j = 1, 2, 3$):
• $\mathrm{BFR}(M_{01}) = 0.135 < \mathrm{BFR}(M_{11}) = 0.180$,
• $\mathrm{BFR}(M_{02}) = 0.165 < \mathrm{BFR}(M_{12}) = 0.177$, and
• $\mathrm{BFR}(M_{03}) = 0.211 < \mathrm{BFR}(M_{13}) = 0.252$.
Thus, the commented methods seem to be riskier in fault-proneness than the non-commented methods. Similar trends in regard to comments have been reported in the previous work [11], [15] as well. Since programmers might want to add comments when they considered that their code is difficult to understand without an explanation, the presence of comments would be a sign indicating that the code is complicated.

Notice that the bug fix rates in the categories of compound names, $M_{03}$ and $M_{13}$, are the highest ones among the categories; only those two categories show bug fix rates which are higher than the average of all (18.1%) (see Fig. 5). Thus, the methods having representative local variables with compound names are likely to be fault-prone regardless of the presence of comments.

F. Results and Discussion: Comparison of Bug Fix Rates over Scope

This subsection compares the bug fix rates among the categories from another in-depth perspective of a local variable's property, "scope."

We first checked correlations of the length of a local variable name with its scope. There does not seem to be a specific correlation between the length of a local variable's name and the length of its scope (see Fig. 6): Spearman rank-correlation coefficients in character count and in word count were 0.083 and 0.0003, respectively. Hence, the length of a local variable's name is statistically independent of the length of its scope, and the scope is not a confounding factor for discussing the fault-proneness of methods by using their local variable's name.

Fig. 6. Scatter diagrams: the length of variable's name vs. the length of variable's scope. (a) Character count (b) Word count. Horizontal axis: scope (lines).

To observe the changes in fault-proneness over the variable's scope, we computed the moving averages of bug fix rates by varying the interval of scope in focus, $[s - 5, s + 5]$ for $s = 9, 10, \ldots, 14$; in simplified terms, we obtained the bug fix rates of methods whose representative local variable's scope is "around $s$" ($s \pm 5$), where the lower and the upper limit of $s$ are decided so as to keep the interval $[s - 5, s + 5]$ within the scope range of all data: between 4 and 19. For example, if $s = 9$ then $[4, 14]$ is the interval in focus, and we focus only on the methods whose representative local variable's scope is "around 9" ($9 \pm 5$). Figure 7 shows those results.

Fig. 7. Moving averages of bug fix rates over scope. (a) Non-commented (b) Commented. Horizontal axis: central scope; vertical axis: bug fix rate.

In Fig. 7(a), we observed the relationships of bug fix rates regardless of scope: $\mathrm{BFR}(M_{01}) < \mathrm{BFR}(M_{02}) < \mathrm{BFR}(M_{03})$, which are similar to the results shown in Fig. 5(a). Thus, we can say with emphasis: while the fault-proneness of methods having representative local variables with shorter names is low, that of the methods having representative local variables with compound names is high. Since the gap in the bug fix rate between $M_{01}$ and $M_{02}$ becomes smaller as the scope gets wider, the superiority of a shorter name may be limited to a narrower scope. If a local variable with a short and simple name is used in a wider scope, it might cause an abuse of the variable or a poor understandability of the program's behavior. While Kernighan and Pike [10] said to give a short and simple name to a local variable, they did not recommend such a naming in any case, and their argument supposed the case that a local variable is used just "locally" within a part of a program. The results observed in Fig. 7(a) seem to support such a programming heuristic.

In Fig. 7(b), while $M_{11}$ (≤ 4 letters) are better than $M_{12}$ (> 4 letters) with narrower scopes around 9 or 10, their magnitude relationship inverts as their scope gets wider. That would be the reason why $\mathrm{BFR}(M_{11}) \simeq \mathrm{BFR}(M_{12})$ in Fig. 5(b). Therefore, we can say that a shorter name is better with a narrower scope, but cannot claim a shorter name is always better. If programmers wanted to add comments, there would be a lack of clarity in their code. In such a case, a shorter name with a wider scope might worsen the program's poor comprehensibility. On the other hand, compound names always show the worst (highest) bug fix rates regardless of scope, similar to the results in Fig. 7(a). Although compound names are usually descriptive, they seem to be signs of fault-prone methods. If a programmer wanted to give a compound name to a local variable, the role of the variable would be somewhat complicated, so methods having such local variables might be riskier than the others in terms of fault-proneness.

G. Answers to RQs

From the results of Sections III-E and III-F, we summarize our findings for RQ1 and RQ2 in the following.

For RQ1, we conclude that methods having local variables with compound names are likely to be faulty regardless of scope. Although we do not imply that compound names cause faults in methods, the presence of a local variable with a compound name may be a clue for finding a risky part from the perspective of the fault-proneness in a method. Such a local variable might have an important role or a more complex role in the program, so it has to be reviewed more carefully.

For RQ2, we can say that shorter names are better for local variables with narrower scope. As a scope gets wider, the positive effect of shorter names seems to decay. While a short and simple name would be preferable as mentioned in some coding conventions and programmers' heuristics [8], [9], [10], our empirical results quantitatively showed that the variable's scope is also a feature worthy of consideration. Moreover, the presence of comments may degrade the superiority of shorter names as their scopes get wider. Therefore, we should take into account not only the composition of a local variable's name but also its scope and comments in the code review.

H. Threats to Validity

This empirical analysis has been conducted for Java products. While another programming language might produce different results, there would not be essential differences in the concept of local variables and comments among Java and many other modern programming languages. Thus, the difference in programming language would not be a serious threat to validity.

In order to avoid data selection bias, we adopted a random sampling in our data collection. Moreover, we used popular OSS products of different sizes from different domains. Therefore, our construction of the dataset would not be a threat to validity.

Since our data is collected from the initial version of the methods, some methods might be no longer used today. However, all methods in our dataset are included in the latest version of the product because we made our method list by checking the latest version of their source files as described in Section III-B. Moreover, we did a random sampling from them. Thus, we consider it will not be a serious threat to validity in our empirical work.

Our definition of compound name is based on the notion of camel case. If there are local variables whose names are composed by another rule such as the snake case, e.g., "number_of_items," they are wrongly categorized into non-compound names. Thus, we rechecked all representative local variables' names included in our data set, and we found only two variables having snake case names, "s_descriptors" and "size_h." Due to the small number of error cases, our name splitting method was not a serious threat to validity.

IV. RELATED WORK

Lawrie et al. conducted a survey on names of identifiers in terms of their comprehensibility for over 100 programmers [7]. In their survey research, they classified names of identifiers into three categories, (1) fully-spelled name, (2) abbreviated name and (3) initial letter (for example, (1) "count," (2) "cnt" and (3) "c"), then compared their understandability. Their results showed that fully-spelled names were the easiest to understand but that there did not seem to be significant differences with abbreviated names in their comprehensibility level.

One of our significant future works is to conduct further analyses of local variables' names, which include an application of the natural language processing technologies to
While their work provides a useful motivation to study evaluate the meaning of local variables’ names. A further whether a shorter name is better or not, they did not discuss analysis with products written in a programming language the fault-proneness of program. other than Java is also our future project in order to ensure Kawamoto and Mizuno [19] conducted an empirical study the generality of the above findings. with two OSS products and reported that a class including ACKNOWLEDGMENT a long identifier tends to be faulty. While their work is one of our most significant previous studies, our work focuses This work was supported by JSPS KAKENHI #16K00099. on a finer-grained artifact—local variable—and conducts a The authors would like to thank anonymous reviewers for their statistical analysis with taking into account of the scopes and helpful comments. the comments. R EFERENCES Binkley et al. [20] focused on the relationship be- [1] G. J. Myers, C. Sandler, and T. Badgett, The Art of Software Testing. tween the length of identifier (including a variable’s name, N.J.: John Wiley & Sons, 2004. a method’s name and a class’s name) and the human [2] P. C. Rigby and C. Bird, “Convergent contemporary software peer short-term memory. They identified that identifiers with review practices,” in Proc. 9th Joint Meeting on Foundations of Softw. Eng., Aug. 2013, pp. 202–212. long names are related to a difficulty in program com- [3] P. L. Li, J. Herbsleb, M. Shaw, and B. Robinson, “Experiences prehension. They were concerned that a long chain, e.g., and results from initiating field defect prediction and product test “class.firstAssignment().name.trim(),” would prioritization efforts at ABB Inc.” in Proc. 28th Int’l Conf. Softw. Eng., May 2006, pp. 413–422. cause a loss of the readability of the code. While the research [4] S. Lessmann, B. Baesens, C. Mues, and S. 
Pietsch, “Benchmarking clas- viewpoint differs from our work, the fundamental concern sification models for software defect prediction: A proposed framework about the length of name is common, and it seems to be well and novel findings,” IEEE Trans. Softw. Eng., vol. 34, no. 4, pp. 485– 496, July 2008. accorded with our results showing the compound names are [5] Y. Liu, T. Khoshgoftaar, and N. Seliya, “Evolutionary optimization not recommended for local variables. of software quality modeling with multiple repositories,” IEEE Trans. Aman et al. [11] reported an empirical analysis showing Softw. Eng., vol. 36, no. 6, pp. 852–864, Nov 2010. [6] F. Rahman and P. Devanbu, “How, and why, process metrics are better,” that Java methods having local variables with long names are in Proc. 2013 Int’l Conf. Softw. Eng., May 2013, pp. 432–441. more likely to be fault-prone and change-prone than the other [7] D. Lawrie, C. Morrell, H. Feild, and D. Binkley, “What’s in a name? a methods. That report is our significant previous work, and this study of identifiers,” in Proc. 14th Int’l Conf. Program Comprehension, June 2006, pp. 3–12. paper focuses more detailed features of local variables, i.e., the [8] Free Software Foundation, “Gnu coding standards,” composition of name and their scopes. While another work by https://www.gnu.org/prep/standards/. Aman et al. [15], reporting that commented programs tend to [9] Sun Microsystems, “Code conventions for the java program- ming language,” http://www.oracle.com/technetwork/java/codeconvtoc- be more fault-prone, is also our important previous work, we 136057.html. conduct a further analysis examining combinations of the local [10] B. W. Kernighan and R. Pike, The practice of programming. Boston, variable’s name, the scope and the comments in this paper. MA: Addison-Wesley Longman, 1999. [11] H. Aman, S. Amasaki, T. Sasaki, and M. Kawahara, “Empirical analysis V. 
C ONCLUSION of change-proneness in methods having local variables with long names and comments,” in Proc. 2015 ACM/IEEE Int’l Symp. Empirical Softw. We have focused on programming artifacts which may Eng. and Measurement, Oct. 2015, pp. 50–53. vary among individuals: local variables’ names and comments. [12] M. J. Sousa and H. Moreira, “A survey on the software maintenance process,” in Proc. Int’l Conf. Softw. Maintenance, Nov. 1998, pp. 265– Popular code conventions say that names of local variables 274. should be shorter and simple, and it seems to have been [13] R. P. Buse and W. R. Weimer, “A metric for software readability,” in a heuristic of programmers. We empirically evaluated the Proc. 2008 Int’l Symp. Softw. Testing and Analysis, 2008, pp. 121–130. [14] M. Fowler, Refactoring: Improving the Design of Existing Code. heuristic in terms of fault-proneness by checking the names Boston, MA: Addison-Wesley Longman, 1999. of local variables, their scopes and the presence of comments. [15] H. Aman, S. Amasaki, T. Sasaki, and M. Kawahara, “Lines of comments The empirical analysis for the data from nine popular OSS as a noteworthy metric for analyzing fault-proneness in methods,” IEICE Trans. Inf. & Syst., vol. E98-D, no. 12, pp. 2218–2228, Dec. 2015. products showed the following three findings. [16] J. Śliwerski, T. Zimmermann, and A. Zeller, “When do changes induce (1) Local variables with compound names can be signs of fixes?” in Proc. Int’l Workshop on Mining Softw, Repositories, May fault-prone methods. 2005, pp. 1–5. [17] H. Aman, “An empirical analysis of the impact of comment statements (2) Methods having the representative local variables with on fault-proneness of small-size module,” in Proc. 19th Asia-Pacific non-compound and shorter names (≤ 4 letters) are less Softw. Eng. Conf., Dec. 2012, pp. 362–367. fault-prone, but their positive effects are decayed as their [18] A. Agresti, Categorical Data Analysis, 2nd ed. N.J.: Wiley, 2002. [19] K. 
Kawamoto and O. Mizuno, “Predicting fault-prone modules using the scopes get wider (around 10 or more lines). length of identifiers,” in Proc. 4th Int’l Workshop on Empirical Softw. (3) Methods having comments in their bodies are also more Eng. in Practice, Oct. 2012, pp. 30–34, Japan. likely to be faulty. [20] D. Binkley, D. Lawrie, S. Maex, and C. Morrell, “Identifier length and limited programmer memory,” Science of Computer Programming, These findings are expected to be useful guidelines for more vol. 74, no. 7, pp. 430–445, May 2009. efficient code reviews. 11