4th International Workshop on Quantitative Approaches to Software Quality (QuASoQ 2016)

Towards Improved Adoption: Effectiveness of Research Tools in the Real World

Richa Awasthy, Australian National University, Canberra, Australia. Email: richa.awasthy@anu.edu.au
Shayne Flint, Research School of Computer Science, Australian National University, Canberra, Australia. shayne.flint@anu.edu.au
Ramesh Sankaranarayana, Research School of Computer Science, Australian National University, Canberra, Australia. ramesh@cs.anu.edu.au

Abstract—One of the challenges in the area of software engineering research has been the low rate of adoption by industry of the tools and methods produced by university researchers. We present a model to improve the situation by providing tangible evidence that demonstrates the real-world effectiveness of such tools and methods. A survey of practising software engineers indicates that the approach in the model is valid and applicable. We apply and test the model for providing such evidence and demonstrate its effectiveness in the context of static analysis using FindBugs. This model can be used to analyse the effectiveness of academic research contributions to industry and contribute towards improving their adoption.

I. INTRODUCTION

The success of software engineering research in universities can be measured in terms of the industrial adoption of methods and tools developed by researchers. Current adoption rates are low [1] and this contributes to a widening gap between software engineering research and practice. Consider, for example, code inspections, which according to Fagan's law are effective in reducing errors in software, increasing productivity and achieving project stability [2]. Static analysis tools developed by university researchers help automate the code inspection process. However, the use of such tools has not obtained widespread adoption in industry. One reason for this limited adoption is that researchers often fail to provide real-world evidence that the methods and tools they develop are of potential value to industry practitioners [1], [3].

One approach to providing such evidence is to conduct experiments that demonstrate the effectiveness of research tools in a real-world context. We apply this approach to analyse the effectiveness of a static analysis tool. In doing so, we demonstrate that such experimentation can contribute to closing the gap between research and practice.

The structure of this paper is as follows: Section II provides the background of our work leading to the proposed model; Section III presents a survey which shows that real-world evidence can positively influence the decision of software engineers to use research tools; Section IV explains our experimental method, which uses FindBugs [4] to analyse real-world bugs in Eclipse [5] code; Section V discusses how simple experiments like ours can encourage more developers to use tools developed by researchers and thus contribute to closing the gap between research and practice; Section VI provides an overview of related research; Section VII presents conclusions and discusses future research.

II. BACKGROUND

Since the 1970s there have been ongoing efforts to increase the adoption of research outcomes outside of universities [6]. As a result of the United States Bayh-Dole Act in 1980 [7], universities began to establish Technology Transfer Offices (TTOs) to facilitate the transfer of knowledge from universities to industry [8]. However, the effectiveness of TTOs has been questioned in recent years [9], [10] and there is a need to look beyond TTOs to improve the adoption of academic research in industry.
Researchers in universities are working towards addressing significant problems. The outcome of their work can be a tool or method which may or may not achieve widespread industry adoption. A key factor limiting the readiness of these research outcomes for adoption in industry is a lack of tangible evidence that they would be effective in practice [1]. This suggests that demonstrating the effectiveness of a tool in practice can contribute to improved adoption.

Figure 1 depicts our model for demonstrating the real-world effectiveness of research tools and methods. The model involves four steps with intermediate activities. The first step is to identify a problem to address. Step 2 is to develop a tool or a method to address the problem. The intermediate iterative activity between these two steps is the process of solution formulation, which involves adding new ideas to the available state of the art. These steps are followed by iterative testing for validation in Step 3, which confirms the readiness of the research outcome for adoption. An idea should be validated in a practical setting [11] to improve its adoption. Our model respects this viewpoint and emphasises the importance of tangible evidence from a practical setting in Step 4. To increase the relevance of the evidence for industry, researchers should test their research outcomes in scenarios that involve real-world users, who are an important stakeholder for industry. Demonstrating the effectiveness of research outcomes in real-world scenarios will change industry perception and improve adoption of the research outcomes.

Fig. 1. Proposed model for improving adoption of university research by industry

We test the applicability of this model in the static analysis context by identifying a static analysis tool created by university researchers and analysing its effectiveness in a real-world scenario.

III. THE IMPACT OF EVIDENCE FROM A REAL-WORLD SCENARIO

In order to understand the impact of evidence from a user scenario on real-world decisions to use a research tool, we conducted an on-line survey of software developers. The survey uses static analysis as an example and was prepared and delivered using our university's online polling system [12]. Participants were invited by email, which included a link to the on-line survey and a participant information statement. On completion of the survey, we manually analysed the results.

A. Participants

We invited 20 software industry practitioners and around 10 computer science researchers with industry experience in software development.

B. Survey Questions

The survey data consisted of responses to the sequence of questions depicted in Figure 2 and described below.

1) Software engineering experience: We gathered information about the level of software development expertise of each participant so that we could examine any relationship between experience and use of static analysis tools.

2) Static analysis knowledge: We asked the following questions to determine each participant's level of understanding and use of static analysis tools.

a) 'Static analysis is the analysis of computer software to find potential bugs without actually executing the software. Have you heard of static analysis before?'

b) 'Have you used any automated static analysis tools during software development (e.g. FindBugs, Coverity)?'
Answers to these questions were used to determine the final question we asked, as indicated in Figure 2.

3) The impact of tangible evidence: At the end of the survey, each participant who had not used static analysis tools was asked a question to determine the impact that tangible evidence (that the tool can identify real bugs early in the software development life-cycle) might have on their approach to static analysis. The exact question asked depended on their answers to the questions described in Section III-B2. Specifically:

1) Participants who had no knowledge of static analysis were asked 'Would our research results interest you in gaining knowledge of static analysis and adoption of automated static analysis tools?'.

2) Participants who knew about static analysis but had not used any static analysis tools (our primary group of interest) were asked to rate the impact that the following factors would have on their decision to adopt static analysis tools. A Likert scale was used with five options (No influence, May be, Likely, Highly Likely, and Definitely).

a) Effectiveness of the tool in finding bugs.
b) Ease of use.
c) Integration of the tool into the development environment.
d) License type.
e) The availability of tangible evidence that the tool can identify real bugs early in the software development life-cycle, before they are reported by users.

Fig. 2. Flowchart for the survey questionnaire design

C. Analysis of Survey Results

The response rate for our survey was high, with 27 responses out of 30 invitations. Responses to the survey indicate that tangible evidence of the real-world effectiveness of a tool has a positive impact on decisions to adopt static analysis tools. Analysis of the survey results provides the following specific findings:

1) Software Engineering Experience - As expected, participants had varied levels of experience in software development. However, we did not find any direct relationship between experience and the usage of tools.

2) Static Analysis Knowledge - Survey results show that 9 participants (33%) had no prior knowledge of static analysis. It is noteworthy that while the remaining 18 participants knew about static analysis, only 4 of them had used static analysis tools.

3) Impact of the tangible evidence - Our survey results show that tangible evidence has a positive impact on decisions to adopt static analysis tools. Out of the 9 respondents who had no prior knowledge of static analysis, 8 said that tangible evidence would interest them in gaining knowledge of static analysis and adopting automated static analysis tools. This is valuable information, indicating that providing evidence of effectiveness could contribute to improved adoption of research tools in industry.

Of the 14 participants who had knowledge of static analysis but who had not used any tools, 7 participants (50%) indicated that tangible evidence would be Highly Likely or would Definitely influence their decision to adopt static analysis tools (Figure 3). Another four participants indicated that such evidence would be Likely to influence their decision. The remaining three participants answered May be; if they eventually respond positively, this would further increase the proportion of participants agreeing that tangible evidence influences the decision.

Fig. 3. The impact of tangible evidence on decisions to adopt static analysis tools
As shown in Figure 4, our results also indicate that other factors such as ease of use, IDE integration and license type have a positive impact on decisions to adopt static analysis tools. It is interesting to note that under the May be and Definitely categories, the top two influencing factors are License and Tangible evidence.

Fig. 4. Other factors influencing decisions to adopt static analysis tools

Our results clearly show that tangible evidence is an important factor in influencing decisions to adopt research tools in industry.

IV. APPLICABILITY OF THE MODEL

In order to test the applicability of our proposed model, we first had to identify an appropriate tool developed by researchers and a scenario in which to test its effectiveness. FindBugs version 3.0 was chosen for our research because, according to the tool's website, there are few organizations using FindBugs [4]. This indicates low adoption of the tool in the software industry. It is also an open-source static analysis tool carrying a university's trademark. We conducted an experiment with the FindBugs static analysis tool to analyse its effectiveness in the real world. To do this, we wanted to determine whether FindBugs is capable of finding bugs reported by real users of Eclipse. We therefore adopted an approach to establishing a connection between warnings generated by the FindBugs static analysis tool and field bugs reported by Eclipse users on Bugzilla [13], consisting of the following steps:

1) Use FindBugs to identify potential bugs in Eclipse class files.
2) Search the Eclipse bug-tracking system Bugzilla to identify bug reports that include stack traces.
3) Match the code in the Java classes associated with the FindBugs warnings identified in step 1) with the code referenced by the stack traces associated with the bugs identified in step 2).

A. FindBugs

FindBugs analyses Java class files to report warnings and potential errors in the associated source code. The tool performs its analysis on Java class files without needing access to the source code. It is stable, easy to use, and, as mentioned in [14], has higher precision than other tools such as CodePro Analytix, PMD and UCDetector. It has a low rate of false positives [15], and it distinguishes more than 400 bug types, categorised by severity level. The analysis is based on bug patterns which are classified into nine categories: Bad Practice, Correctness, Experimental, Internationalization, Malicious Code Vulnerability, Multithreaded Correctness, Performance, Security, and Dodgy Code. The warnings reported by FindBugs can be further categorised within the tool as Scariest (Ranks 1-4), Scary (Ranks 1-9) and Troubling (Ranks 1-14). Each category includes all warnings within the stated rank range; for example, the 'Scary' category also lists the warnings included under the 'Scariest' category.
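To make the bug-pattern discussion more concrete, the fragment below is a small illustrative Java example of our own (it is not Eclipse code) showing the kind of defect FindBugs reports in its Correctness category. The pattern names in the comments follow the FindBugs bug-pattern catalogue; whether a particular build of the tool flags each line can vary with its configuration.

```java
// Illustrative only: methods of the kind FindBugs flags in the Correctness
// category. The pattern names in the comments follow the FindBugs catalogue.
public class WarningExamples {

    // NP_NULL_ON_SOME_PATH: on the branch where no widget is available,
    // the dereference below is guaranteed to throw a NullPointerException.
    String label(boolean fallback) {
        Object widget = fallback ? null : lookupWidget();
        return widget.toString();      // possible null pointer dereference
    }

    // NP_LOAD_OF_KNOWN_NULL_VALUE: the variable is known to be null at the
    // point where it is dereferenced, because of the preceding check.
    int length(String s) {
        if (s == null) {
            return s.length();         // load of a value known to be null
        }
        return s.length();
    }

    private Object lookupWidget() {
        return Math.random() < 0.5 ? new Object() : null;
    }
}
```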
B. Eclipse

We identified Eclipse as the object of analysis as it is a large, widely used open-source project with good records of user-reported bugs over many years and versions of the software. For our experimentation we focused on the analysis of Eclipse version 4.4 (Luna) because it was the current version at the time of our experimentation.

C. Identification of potential bugs

The Java JARs associated with Eclipse version 4.4 were analysed using FindBugs version 3.0. FindBugs generated a list of warnings pertaining to code it considers faulty.

D. Search for user reported bugs

The Eclipse project uses Bugzilla to track bugs reported by users. In order to identify bugs that could be associated with FindBugs warnings, we needed to identify bug reports that included a documented stack-trace. This was achieved by performing an advanced Bugzilla search for bugs that satisfied the following criteria:

• Version: 4.4,
• Classification: Eclipse,
• Status: NEW, ASSIGNED, VERIFIED.¹

¹Because our experiment looks at the ability of FindBugs to find bugs that have not been fixed, we ignore bugs with CLOSED status. In addition, we do not consider the possibility of bugs that have been closed incorrectly.

We then inspected the query results to identify those bug reports that included a documented stack-trace.
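As an aside on how such a search can be automated, the sketch below exports the same query in CSV form through Bugzilla's generic buglist.cgi interface. The URL parameters follow common Bugzilla conventions and are our assumption rather than part of the experiment described here; the inspection of each report for a stack trace remained a manual step.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Sketch: export the Bugzilla search of Section IV-D as CSV.
// Assumes the standard buglist.cgi CSV export; parameter names may differ
// between Bugzilla versions, so treat this as illustrative only.
public class BugzillaQuery {
    public static void main(String[] args) throws Exception {
        String query = "https://bugs.eclipse.org/bugs/buglist.cgi"
                + "?classification=Eclipse"
                + "&version=4.4"
                + "&bug_status=NEW&bug_status=ASSIGNED&bug_status=VERIFIED"
                + "&ctype=csv";
        try (BufferedReader in = new BufferedReader(new InputStreamReader(
                new URL(query).openStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line);  // one bug per line: id, summary, status, ...
            }
        }
    }
}
```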
E. Match FindBugs warnings with user reported bugs

Our last step was to match the warnings generated by FindBugs (Section IV-C) with the user-reported bugs (Section IV-D). This was achieved using the following steps:

1) For each of the bugs identified using the procedure described in Section IV-D, we identified the Java class that was the likely source of the reported bug. This class was usually the one appearing at the top of the stack-trace. In some cases, we had to traverse lower levels of the stack-trace to find matching classes.

2) We then searched for the above classes in the warnings generated by FindBugs (Section IV-C) and analysed the code associated with each warning. We did this by using the FindBugs class name filter feature to show warnings related to the class of interest.

3) Finding a matching line of code in the FindBugs warnings establishes a connection between the warnings generated by FindBugs and the bugs reported by users (a sketch of this matching step is given below).
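The sketch below outlines how the class-name matching in steps 1) to 3) can be mechanised. It assumes FindBugs was run with XML output, in which each reported bug instance records the affected class in a classname attribute, and it uses a simple regular expression to pull fully qualified class names out of the stack trace pasted into a bug report. The class name StackTraceMatcher, the file names and the regular expressions are our own illustration, not part of FindBugs; any match printed still has to be inspected by hand, as in step 2).

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch of the matching step (Section IV-E): does any class named in a
// user-reported stack trace also appear in the FindBugs warning list?
// Assumes FindBugs XML output containing attributes such as
//   <Class classname="org.eclipse.ui.part.PageBookView">
public class StackTraceMatcher {

    // e.g. "at org.eclipse.ui.part.PageBookView.showPageRec(PageBookView.java:123)"
    private static final Pattern FRAME =
            Pattern.compile("at\\s+([\\w.$]+)\\.[\\w$<>]+\\(");

    private static final Pattern WARNED_CLASS =
            Pattern.compile("classname=\"([\\w.$]+)\"");

    public static void main(String[] args) throws Exception {
        String findbugsXml = Files.readString(Paths.get("warnings.xml"));
        String stackTrace  = Files.readString(Paths.get("bug-report-stacktrace.txt"));

        Set<String> warnedClasses = extract(WARNED_CLASS, findbugsXml);
        for (String frameClass : extract(FRAME, stackTrace)) {
            if (warnedClasses.contains(frameClass)) {
                // Candidate connection: inspect the warning and the traced line by hand.
                System.out.println("Possible match: " + frameClass);
            }
        }
    }

    private static Set<String> extract(Pattern p, String text) {
        Set<String> result = new LinkedHashSet<>();
        Matcher m = p.matcher(text);
        while (m.find()) {
            result.add(m.group(1));
        }
        return result;
    }
}
```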
F. Results of the Experiment

1) Analysis of FindBugs warnings for Eclipse version 4.4: Our analysis showed that static analysis of Eclipse version 4.4 generated warnings in the categories of Correctness (652), Bad Practice (547), Experimental (1), Multithreaded correctness (390), Security (3), and Dodgy code (55) under the rank range of Troubling (Ranks 1-14), as depicted in Figure 5. We focused on the Scariest warnings (Ranks 1-4), considering them as real problems requiring attention. There were 82 Scariest warnings in total. These comprised 81 warnings in the Correctness category and one in the Multi-threaded correctness category. Additional investigation found that warnings in the Correctness category included a range of coding errors such as comparisons of incompatible types, null pointer dereferences and reading of uninitialized variables.

Fig. 5. Number of warnings in each category in Eclipse 4.4

2) Connection between FindBugs warnings and user reported bugs: Execution of the query described in Section IV-D resulted in a dataset of 2575 bugs, which included 347 enhancements. We excluded the enhancements from our analysis. Out of the remaining bugs, we have analysed 1185 bug reports so far, 90 of which included a documented stack-trace. We used the method described in Section IV-E to compare the stack-traces in these 90 bug reports with the warnings generated by FindBugs (Section IV-F1). We found that six of the user-reported bugs could be associated with FindBugs warnings, as presented in Table I. The data presented in the table includes the Bugzilla Bug ID, a description of the bug, and a description of the warning generated by FindBugs.

TABLE I
USER-FILED BUGS IN ECLIPSE WITH ASSOCIATED WARNINGS IN FINDBUGS

Bug Id 436138. Problem: NPE when selecting SWT.MOZILLA style in control example. FindBugs warning: Method call passes null for nonnull parameter. This method call passes a null value for a nonnull method parameter. Either the parameter is annotated as a parameter that should always be nonnull, or analysis has shown that it will always be dereferenced. Bug kind and pattern: NP - NP_NULL_PARAM_DEREF.

Bug Id 414508. Problem: NPE trying to launch Eclipse App. FindBugs warning: Load of known null value. The variable referenced at this point is known to be null due to an earlier check against null. Although this is valid, it might be a mistake (perhaps you intended to refer to a different variable, or perhaps the earlier check to see if the variable is null should have been a check to see if it was nonnull). Bug kind and pattern: NP - NP_LOAD_OF_KNOWN_NULL_VALUE.

Bug Id 433526. Problem: browser.getText() is throwing an exception after Internet Explorer 11 install. FindBugs warning: Possible null pointer dereference. There is a branch of statement that, if executed, guarantees that a null value will be dereferenced, which would generate a NullPointerException when the code is executed. Of course, the problem might be that the branch or statement is infeasible and that the null pointer exception can't ever be executed; deciding that is beyond the ability of FindBugs. Bug kind and pattern: NP - NP_NULL_ON_SOME_PATH.

Bug Id 459025. Problem: Can't right-click items in manifest editor's extensions tab on OSX. FindBugs warning: Non-virtual method call passes null for nonnull parameter. A possibly-null value is passed to a nonnull method parameter. Either the parameter is annotated as a parameter that should always be nonnull, or analysis has shown that it will always be dereferenced. Bug kind and pattern: NP - NP_NULL_PARAM_DEREF_NONVIRTUAL.

Bug Id 427421. Problem: NumberFormatException in periodic Workspace Save Job. FindBugs warning: Boxing/unboxing to parse a primitive. A boxed primitive is created from a String, just to extract the unboxed primitive value. It is more efficient to just call the static parseXXX method. Bug kind and pattern: Bx - DM_BOXED_PRIMITIVE_FOR_PARSING.

Bug Id 428890. Problem: Search view only shows default page (NPE in PageBookView.showPageRec). FindBugs warning: Possible null pointer dereference. There is a branch of statement that, if executed, guarantees that a null value will be dereferenced, which would generate a NullPointerException when the code is executed. Of course, the problem might be that the branch or statement is infeasible and that the null pointer exception can't ever be executed; deciding that is beyond the ability of FindBugs. Bug kind and pattern: NP - NP_NULL_ON_SOME_PATH.

Bug Id 426485. Problem: [EditorMgmt][Split editor] Each split causes editors to be leaked. FindBugs warning: Possible null pointer dereference. There is a branch of statement that, if executed, guarantees that a null value will be dereferenced, which would generate a NullPointerException when the code is executed. Of course, the problem might be that the branch or statement is infeasible and that the null pointer exception can't ever be executed; deciding that is beyond the ability of FindBugs. Bug kind and pattern: NP - NP_NULL_ON_SOME_PATH.
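As a concrete illustration of the last warning type in Table I, the fragment below is our own minimal example (not the Eclipse source associated with bug 427421) of the Bx - DM_BOXED_PRIMITIVE_FOR_PARSING pattern, together with the rewrite that the warning text suggests.

```java
// Illustrative example of Bx - DM_BOXED_PRIMITIVE_FOR_PARSING (our own code,
// not the Eclipse source behind bug 427421 in Table I).
public class ParseExample {

    // Flagged: a boxed Integer is created from the String only to extract the
    // primitive value; a malformed string still throws NumberFormatException.
    static int saveIntervalFlagged(String value) {
        return Integer.valueOf(value).intValue();
    }

    // Rewrite suggested by the warning: call the static parseXXX method directly.
    static int saveIntervalSuggested(String value) {
        return Integer.parseInt(value);
    }
}
```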
V. DISCUSSION

The model proposed in this paper considers that, in order to improve the adoption of university research outcomes, researchers need to demonstrate their effectiveness in a real-world scenario and think about the value of their research outcomes beyond the lab boundaries. Our main purpose in conducting the experiment described was to test the applicability of our proposed model by analysing the value of a research tool in industrial practice. Specifically, we evaluated the performance of the FindBugs static analysis tool by analysing its capability in generating warnings relating to real-world bugs reported by users of the Eclipse IDE.

Our results indicate that FindBugs is capable of identifying defects that later manifest themselves as bugs reported by users. Since real-world evidence would influence the decision to adopt a tool, as indicated in the survey results, FindBugs needs to increase the proportion of such warnings to make the evidence convincing. Currently, FindBugs has no mechanism for singling out, within a warning list that also contains false positives, the warnings that will actually manifest as user-visible bugs. Our research suggests a direction for improvement: identifying the important warnings in the warning base by analysing historical data and user requirements. FindBugs results could be improved by introducing a new 'user-impact' category to classify warnings that will potentially have an impact in the client environment and hence need immediate attention. For this, sufficient information from industrial practice needs to be applied when testing the research tool.

The model appears simple, but it is challenging for researchers to identify a scenario and to approach users and/or data with which to demonstrate the effectiveness of their tools or methods. It may not always be possible to find user-filed data. Once researchers have the data to demonstrate effectiveness, they also need a mechanism to propagate it to industry. Furthermore, the concern about the relevance of research suggests that researchers need to think about the relevance of the problem they are trying to address. These factors indicate that universities and industry need to start collaborating at an early stage and consider co-developing whenever possible and feasible. This can provide a good start towards improving the relevance and adoption of research outcomes in general, and bridging the gap between industry and academia.

A. Limitations and Threats to Validity

Our experiments were limited by the small number of Eclipse bugs reported with a documented stack-trace. It is important to note that the ability to analyse only 90 of the 1185 bugs considered reflects a limitation of our approach and does not reflect a limitation of FindBugs. There are some limitations to the validity of our experiments:

1) FindBugs does not always point to the exact line number referred to in the stack trace. It is possible that the actual source of an error differs from the warnings provided by FindBugs.
2) While it is likely that the FindBugs warnings listed in Table I are the actual cause of the listed real-world bugs reported by Eclipse users, we cannot be certain of this.
3) As there is a lack of literature detailing the use of static analysis tools like FindBugs by the Eclipse development team, a comparison study was not feasible.
4) This test highlights the relevance of static analysis tools, though their effectiveness in real-world projects, particularly large-scale projects, cannot be confirmed as the sample size of our survey is small. However, considering the sample sizes of 20 and 18 used in related studies [16], [17], we decided to proceed with our sample size of 27 participants.
5) The conclusions might not generalise, as the results are specific to the static analysis context. The proposed model needs to be validated with tools from other phases of software development.
VI. RELATED WORK

Adoption of software engineering research outcomes in industry practice has been a concern [1], [18]. There have been ongoing efforts to improve the adoption of research outcomes since the 1970s. However, these efforts have mainly focused on approaches to increasing university-industry collaboration and improving adoption through technology transfer. They include policy changes leading to the establishment of TTOs in universities and the proposal of models for effective technology transfer [8], [19]. However, TTOs generally focus on building collaborative relationships between researchers and industry [20] rather than on the readiness of research outcomes for adoption in industry. In this paper we have demonstrated how simple experiments can analyse the effectiveness of software engineering research tools in practice. Specifically, we analysed the effectiveness of a static analysis tool.

Various experiments have been conducted to demonstrate the effectiveness of the FindBugs static analysis tool by showing that it is able to detect the specific problems it has been designed to detect [15], [21], [22]. In contrast, our experiment was conducted in an unconstrained environment involving real-world scenarios that have affected clients. Ruthruff et al. [23] involved developers in determining which reports were useful and which were not. This information was used to filter FindBugs warnings so that only those that developers found useful were reported. We instead retrace user-filed bugs to the warnings generated by the static analysis tool. This can pave the way for an intelligent mechanism that prioritises warnings based on user impact.

Bessey et al. [24] identify several factors that affected the industry adoption of their static analysis tool Coverity [25]. These include trust between the researchers and industry users, the number of false positives, and the capability of the tool to provide warnings relating to problems that have had a significant impact on its users. Our work also confirms that a tool's capability is important, and it further identifies licensing, IDE integration and ease of use as significant factors.

Johnson et al. [16] investigated the reasons behind the low adoption rate of static analysis tools despite their proven benefits in finding bugs. Their investigations confirmed that a large number of false positives is a major factor in low adoption rates. We note that their findings were based on a survey of developers who had all used static analysis tools, which meant the authors were not able to comment on whether low adoption rates were due to a lack of awareness of static analysis tools among developers.

None of the work described above analyses the effectiveness of FindBugs in identifying problems that manifest themselves as real-world bugs reported by users. The experiments described in this paper analyse the connection between warnings generated by the FindBugs static analysis tool and field defects reported by Eclipse users on Bugzilla, bringing the client perspective into the evaluation. The experiments test the applicability of our proposed model in the static analysis context.
VII. CONCLUSION AND FUTURE WORK

We have proposed a model to contribute towards improving the adoption of research tools by industry by demonstrating the effectiveness of a tool in a real-world scenario. We have presented a mechanism which uses a research tool as a medium for building tangible evidence. A survey of software developers supports our hypothesis that such tangible evidence of the effectiveness of a tool can have a positive influence on real-world decisions to adopt static analysis tools.

A further experiment testing the applicability of the model in the static analysis context was conducted. In this experiment, by establishing a connection between user-reported bugs and warnings generated by the FindBugs static analysis tool, we have demonstrated the ability of static analysis tools to eliminate some defects before software is deployed. However, the evidence needs to be stronger, in terms of the number of such connections, in order to be more convincing and to improve the industrial adoption of the tool.

Future research will present a more detailed analysis of the complete list of bugs found in Section IV-F2, which will provide more precise data about the effectiveness of the tool under our approach. Our approach also presents a scenario in which industry and university researchers can work together to create more useful tools. We plan to discuss these results with the FindBugs development team to explore the possibility of strengthening the evidence and devising a new 'user-impact' classification to indicate the warnings that would manifest in the client environment.

Finally, we would like to adapt this approach to explore the effectiveness of research tools involved in other phases of the software development life-cycle.

REFERENCES

[1] D. Rombach and F. Seelisch, "Balancing agility and formalism in software engineering," B. Meyer, J. R. Nawrocki, and B. Walter, Eds. Berlin, Heidelberg: Springer-Verlag, 2008, ch. Formalisms in Software Engineering: Myths Versus Empirical Facts, pp. 13–25.
[2] A. Endres and H. D. Rombach, A Handbook of Software and Systems Engineering: Empirical Observations, Laws and Theories. Pearson Education, 2003.
[3] M. Ivarsson and T. Gorschek, "A method for evaluating rigor and industrial relevance of technology evaluations," Empirical Software Engineering, vol. 16, no. 3, pp. 365–395, 2011.
[4] University of Maryland, "FindBugs," viewed May 2015, http://findbugs.sourceforge.net, 2015.
[5] The Eclipse Foundation, "Eclipse," viewed May 2015, http://www.eclipse.org, 2015.
[6] R. Grimaldi, M. Kenney, D. S. Siegel, and M. Wright, "30 years after Bayh-Dole: Reassessing academic entrepreneurship," Research Policy, vol. 40, no. 8, pp. 1045–1057, 2011.
[7] W. H. Schacht, "Patent ownership and federal research and development (R&D): A discussion on the Bayh-Dole act and the Stevenson-Wydler act," Congressional Research Service, Library of Congress, 2000.
[8] D. S. Siegel, D. A. Waldman, L. E. Atwater, and A. N. Link, "Commercial knowledge transfers from universities to firms: improving the effectiveness of university-industry collaboration," The Journal of High Technology Management Research, vol. 14, no. 1, pp. 111–133, 2003.
[9] J. G. Thursby, R. Jensen, and M. C. Thursby, "Objectives, characteristics and outcomes of university licensing: A survey of major US universities," The Journal of Technology Transfer, vol. 26, no. 1-2, pp. 59–72, 2001.
[10] D. S. Siegel, D. A. Waldman, L. E. Atwater, and A. N. Link, "Toward a model of the effective transfer of scientific knowledge from academicians to practitioners: qualitative evidence from the commercialization of university technologies," Journal of Engineering and Technology Management, vol. 21, no. 1, pp. 115–142, 2004.
[11] R. L. Glass, "The relationship between theory and practice in software engineering," Communications of the ACM, vol. 39, no. 11, pp. 11–13, 1996.
[12] The Australian National University, "ANU polling online," viewed July 2015, https://anubis.anu.edu.au/apollo/, 2015.
[13] Creative Commons License, "Bugzilla," viewed May 2015, https://www.bugzilla.org, 2015.
[14] A. K. Tripathi and A. Gupta, "A controlled experiment to evaluate the effectiveness and the efficiency of four static program analysis tools for Java programs," in Proceedings of the 18th International Conference on Evaluation and Assessment in Software Engineering. ACM, 2014, p. 23.
[15] D. Hovemeyer and W. Pugh, "Finding bugs is easy," ACM SIGPLAN Notices, vol. 39, no. 12, pp. 92–106, 2004.
[16] B. Johnson, Y. Song, E. Murphy-Hill, and R. Bowdidge, "Why don't software developers use static analysis tools to find bugs?" in Software Engineering (ICSE), 2013 35th International Conference on. IEEE, 2013, pp. 672–681.
[17] L. Layman, L. Williams, and R. S. Amant, "Toward reducing fault fix time: Understanding developer behavior for the design of automated fault detection tools," in Empirical Software Engineering and Measurement, 2007. ESEM 2007. First International Symposium on. IEEE, 2007, pp. 176–185.
[18] S. Beecham, P. O'Leary, I. Richardson, S. Baker, and J. Noll, "Who are we doing global software engineering research for?" in 2013 IEEE 8th International Conference on Global Software Engineering. IEEE, 2013, pp. 41–50.
[19] S. L. Pfleeger, "Understanding and improving technology transfer in software engineering," Journal of Systems and Software, vol. 47, no. 2, pp. 111–124, 1999.
[20] D. S. Siegel, R. Veugelers, and M. Wright, "Technology transfer offices and commercialization of university intellectual property: performance and policy implications," Oxford Review of Economic Policy, vol. 23, no. 4, pp. 640–660, 2007.
[21] N. Ayewah, W. Pugh, J. D. Morgenthaler, J. Penix, and Y. Zhou, "Using FindBugs on production software," in Companion to the 22nd ACM SIGPLAN Conference on Object-Oriented Programming Systems and Applications. ACM, 2007, pp. 805–806.
[22] ——, "Evaluating static analysis defect warnings on production software," in Proceedings of the 7th ACM SIGPLAN-SIGSOFT Workshop on Program Analysis for Software Tools and Engineering. ACM, 2007, pp. 1–8.
[23] J. R. Ruthruff, J. Penix, J. D. Morgenthaler, S. Elbaum, and G. Rothermel, "Predicting accurate and actionable static analysis warnings: an experimental approach," in Proceedings of the 30th International Conference on Software Engineering. ACM, 2008, pp. 341–350.
[24] A. Bessey, K. Block, B. Chelf, A. Chou, B. Fulton, S. Hallem, C. Henri-Gros, A. Kamsky, S. McPeak, and D. Engler, "A few billion lines of code later: using static analysis to find bugs in the real world," Communications of the ACM, vol. 53, no. 2, pp. 66–75, 2010.
[25] Synopsys Inc., "Coverity," viewed May 2015, http://www.coverity.com, 2015.