6th International Workshop on Quantitative Approaches to Software Quality (QuASoQ 2018)

Report on the 6th International Workshop on Quantitative Approaches to Software Quality (QuASoQ 2018)

Horst Lichter, RWTH Aachen University, Germany, lichter@swc.rwth-aachen.de
Thanwadee Sunetnanta, Mahidol University, Thailand, thanwadee.sun@mahidol.ac.th
Toni Anwar, Universiti Teknologi PETRONAS, Malaysia, toni.anwar@utp.edu.my
Taratip Suwannasart, Chulalongkorn University, Thailand, taratip.s@chula.ac.th

I. INTRODUCTION

Based on the feedback and experience we got from the 5th workshop, we slightly adjusted the list of topics for the workshop and planned to invite a keynote speaker again. The topics of interest included

• New approaches to measurement, evaluation, comparison and improvement of software quality
• Metrics and quantitative approaches in agile projects
• Case studies and industrial experience reports on successful or failed application of quantitative approaches to software quality
• Tools, infrastructure and environments supporting quantitative approaches
• Empirical studies, evaluation and comparison of measurement techniques and models
• Quantitative approaches to test process improvement, test strategies or testability
• Empirical evaluations or comparisons of testing techniques in industrial settings

Overall, the workshop aimed at bringing together researchers and practitioners to discuss experiences in the application of state-of-the-art approaches to measure, assess and evaluate the quality of both software systems and software development processes in general, and software test processes in particular.

As software development organizations are always forced to develop software in the "right" quality, quality specification and quality assurance are crucial. Although there are many approaches to deal with quantitative quality aspects, it is still challenging to choose a suitable set of techniques that best fits the specific project and organizational constraints.

Even though approaches, methods, and techniques have been known for quite some time, little effort has been spent on exchanging experience with the real-world problems of quantitative approaches. For example, only limited research has been devoted to empirically evaluating risks, efficiency or limitations of different testing techniques in industrial settings.

Hence, one main goal of the workshop was to exchange experience, present new promising approaches and discuss how to set up, organize, and maintain quantitative approaches to software quality.

II. WORKSHOP FORMAT

Based on our former experience, we wanted the workshop to be highly interactive. In order to have an interesting and interactive event in which much experience is shared, we organized the workshop presentations according to the author-discussant model.

In this workshop model, papers are presented by one of the authors. After the presentation, a discussant starts the discussion based on his or her pre-formulated questions. Therefore, the discussant had to prepare a set of questions and had to know the details of the presented paper. The general structure of each talk was as follows:

• The author of a paper presented the paper (20 minutes).
• After that, the discussant of the paper opened the discussion using his or her questions (5 minutes).
• Finally, we moderated the discussion among the whole audience (5 minutes).

III. INVITED TALK

This year we were happy to have Prof. Hongyu Zhang as our invited speaker. Hongyu Zhang is currently an Associate Professor at The University of Newcastle, Australia. Previously, he was a Lead Researcher at Microsoft Research Asia and an Associate Professor at Tsinghua University, China. He received his PhD degree from National University of Singapore in 2003. His research is in the area of Software Engineering, in particular, software analytics, testing, maintenance, and reuse. The main theme of his research is to improve software quality and productivity by mining software data. He has published more than 120 research papers in international journals and conferences, including TSE, TOSEM, ICSE, FSE, POPL, AAAI, KDD, IJCAI, ASE, ISSTA, ICSM, ICDM, and USENIX. He received two ACM Distinguished Paper awards. He also served as a program chair and committee member for many software engineering conferences. He is on the Editorial Board of Journal of Systems and Software, and is a Senior Member of IEEE.

In his talk entitled “Intelligent Fault Diagnosis and Prediction through Data Analytics”, Prof. Hongyu Zhang presented important insights into how the analysis of big data can support the prediction of faults in systems and thus help managers take the right decisions before releasing a software system.

IV. WORKSHOP CONTRIBUTIONS

Altogether ten papers were submitted. Finally, nine papers were accepted by the program committee for presentation and publication, covering very different topics. We grouped the papers into three sessions and added a final round-up slot to present and discuss the major findings of our workshop. In the following we give a short overview of the accepted papers.

A. Yeongjun Cho, Jung-Hyun Kwon, In-Young Ko: Cross-Sub-Project Just-in-Time Defect Prediction on Multi-Repo Projects

Just-in-time (JIT) defect prediction, which predicts defect-inducing code changes, can provide faster and more precise feedback to developers than traditional module-level defect prediction methods. We find that large-scale projects such as Google Android and Apache Maven divide their projects into multiple sub-projects, in which relevant source code is managed separately in different repositories. Although sub-projects tend to suffer from a lack of the historical data required to build a defect prediction model, the feasibility of applying cross-sub-project JIT defect prediction has not yet been studied. A cross-sub-project model to predict bug-inducing commits in the target sub-project could be built with data from all other sub-projects within the project of the target sub-project, or with data from the sub-projects of other projects, as in traditional project-level JIT defect prediction methods. Alternatively, we can rank sub-projects and select high-ranked sub-projects within the project to build a filtered-within-project model. In this work, we define a sub-project similarity measure based on the number of developers who have contributed to both sub-projects to rank sub-projects. We extract the commit data from 232 sub-projects across five different projects and evaluate the cost effectiveness of various cross-sub-project JIT defect prediction models. Based on the results of the experiments, we conclude that 1) cross-sub-project JIT defect prediction generally has better cost effectiveness than within-sub-project JIT defect prediction, especially when the sub-projects from the same project are used as training data; 2) in filtered-within-project JIT defect prediction models, the developer similarity-based ranking can achieve higher cost effectiveness than the other ranking methods; and 3) although a developer similarity-based filtered-within-project model achieves lower cost effectiveness than a within-project model in general, we find that there is room for further improvement to the filtered-within-project model that may outperform the within-project model.

B. Chao Zhang, Weiliang Yin and Zhiqiang Lin: Boost Symbolic Execution Using Dynamic State Merging and Forking

Symbolic execution has achieved wide application in software testing and analysis. However, path explosion remains the bottleneck limiting the scalability of most symbolic execution engines in practice. One of the promising solutions to address this issue is to merge explored states and decrease the number of paths. Nevertheless, state merging at the same time leads to an increase in the complexity of path predicates, especially in the situation where variables with concrete values are turned symbolic and chances of concretely executing some statements are dissipated. As a result, calculating expressions and constraints becomes much more time consuming and thus the performance of symbolic execution is weakened. To resolve the problem, we propose a merge-fork framework enabling states under exploration to switch automatically between merging mode and forking mode. First, active state forking is introduced to enable forking a state into multiple ones as if a certain merging action taken before were eliminated. Second, we perform dynamic merge-fork analysis to cut source code into pieces and continuously evaluate the efficiency of different merging strategies for each piece. Our approach dynamically combines paths under exploration to maximize opportunities for concrete execution and ease the burden on underlying solvers. We implement the framework on the foundation of the symbolic execution engine KLEE, and conduct experiments on GNU Coreutils code using our prototype to present the effect of our proposition. Experiments show up to 30% speedup and 80% decrease in queries compared to existing works.

C. Konrad Fögen and Horst Lichter: A Case Study on Robustness Fault Characteristics for Combinatorial Testing - Results and Challenges

Combinatorial testing is a well-known black-box testing approach. Empirical studies suggest the effectiveness of combinatorial coverage criteria. So far, the research focuses on positive test scenarios. However, robustness is an important characteristic of software systems and testing negative scenarios is crucial. Combinatorial strategies are extended to generate invalid test inputs, but the effectiveness of negative test scenarios is still unclear. Therefore, we conduct a case study and analyze 434 failures reported as bugs of a financial enterprise application. As a result, 51 robustness failures are identified, including failures triggered by invalid value combinations and failures triggered by interactions of valid and invalid values. Based on the findings, four challenges for combinatorial robustness testing are derived.

D. Séverine Sentilles, Efi Papatheocharous and Federico Ciccozzi: What do we know about software security evaluation? A preliminary study

In software development, software quality is nowadays acknowledged to be as important as software functionality and there exists an extensive body of knowledge on the topic. Yet, software quality is still marginalized in practice: there is no consensus on what software quality exactly is, or how it is achieved and evaluated. This work investigates the state of the art of software quality by focusing on the description of evaluation methods for a subset of software qualities, namely those related to software security. The main finding of this paper is the lack of information regarding fundamental aspects that ought to be specified in an evaluation method description. This work follows up the authors’ previous work on the Property Model Ontology by carrying out a systematic investigation of the state of the art on evaluation methods for software security. Results show that only 25% of the papers studied provide enough information on the security evaluation methods they use in their validation processes, whereas the rest of the papers lack important information about various aspects of the methods (e.g., benchmarking and comparison to other properties, parameters, applicability criteria, assumptions and available implementations). This is a major hindrance to their further use.

E. Maohua Gan, Kentaro Sasaki, Akito Monden and Zeynep Yucel: Generation of Mimic Software Project Data Sets for Software Engineering Research

To conduct empirical research on industry software development, it is necessary to obtain data of real software projects from industry. However, only few such industry data sets are publicly available, and unfortunately most of them are very old. In addition, most of today’s software companies cannot make their data open, because software development involves many stakeholders, and thus its data confidentiality must be strongly preserved. This paper proposes a method to artificially generate a “mimic” software project data set whose characteristics (such as average, standard deviation and correlation coefficients) are very similar to a given confidential data set. The proposed method uses the Box–Muller method for generating normally distributed random numbers; then, exponential transformation and number reordering are used for data mimicry. Instead of using the original (confidential) data set, researchers are expected to use the mimic data set to produce results similar to those of the original data set. To evaluate the usefulness of the proposed method, effort estimation models were built from an industry data set and its mimic data set. We confirmed that the two models are very similar to each other, which suggests the usefulness of our proposal.

F. Nayla Nasir and Nasir Mehmood Minhas: Implementing Value Stream Mapping in a Scrum-based project - An Experience Report

Value stream mapping (VSM) is one of the lean practices that helps to visualize the whole process and identify any bottlenecks affecting the flow. Proper management of the value stream can significantly contribute towards waste elimination by categorizing process activities as either value-adding or non-value-adding. Lean development focuses on value through the elimination of waste. Adding value through embracing change and customer satisfaction are also benefits of Scrum. This study reports our experience regarding the implementation of VSM with Scrum. We followed the action research method, with the objective to see if VSM can contribute to the identification and reduction of waste in a Scrum-based project. We identified a noticeable amount of waste even with strict compliance to the Scrum practices. On the basis of the identified waste, its root causes, and possible mitigation strategies, we have proposed a future state map that could help improve the productivity of the process. The results of our study are encouraging, and we suggest that adoption of VSM with Scrum could add more value to Scrum-based projects.

G. Ankush Dadwal, Hironori Washizaki, Yoshiaki Fukazawa, Takahiro Iida, Masashi Mizoguchi and Kentaro Yoshimura: Prioritization in Automotive Software Testing: Systematic Literature Review

Automotive Software Testing is a vital part of the automotive systems development process. Not identifying the critical safety issues and failures of such systems can have serious or even fatal consequences. As the number of embedded systems and technologies increases, testing all components becomes more challenging. Although testing is expensive, it is important to reduce bugs at an early stage to maintain safety and to avoid recalls. Hence, the testing time should be reduced without impacting the reliability. Several studies and surveys have prioritized Automotive Software Testing to increase its effectiveness. The main goals of this study are to identify: (i) the publication trends of prioritization in Automotive Software Testing, (ii) which methods are used to prioritize Automotive Software Testing, (iii) the distribution of studies based on the quality evaluation, and (iv) how existing research on prioritization helps optimize Automotive Software Testing.

H. Reishi Yokomori, Norihiro Yoshida, Masami Noro and Katsuro Inoue: Use-Relationship Based Classification for Software Components

In recent years, the maintenance period of software systems has been increasing. The size of software systems has grown, and the number of classes and the relationships between classes are also increasingly complicated. If we can categorize software components based on information such as functions and roles, we believe that these classified components can be understood together, and are useful for understanding the system. In this paper, we proposed a classification method for software components based on similarity of use relation. For each component, the set of components used by the component was analyzed. Then, for each pair of components, the distance was calculated from the coincidence of the two sets. A distance matrix was created and components were classified by hierarchical cluster analysis. We applied this method to jlGui, which consists of 70 components. 8 clusters of 36 components were extracted from the 70 components. Characteristics of the extracted clusters were evaluated, and the content of each cluster was introduced as a case study. In 7 out of the 8 clusters, the components of the cluster were strongly similar to each other from the viewpoint of their functions. Through these experiments, we confirmed that our method is effective for classifying components of the target software, and is useful for understanding them.

V. SUMMARY OF THE DISCUSSIONS

About 30 researchers attended the workshop and participated in the discussions. The author-discussant model was well received by the participants and led to intensive discussions among them.

For instance, the discussion of paper D (Séverine Sentilles et al.) focused on issues regarding the categorization of the existing literature on evaluation methods related to software security into three groups based on their main purpose. The first group focuses on defining a new property or metric. The second group bases its work on already defined properties. Finally, the last group does not explicitly refer to any property, method or metric.

As another example, the discussion of paper G (Ankush Dadwal et al.) focused on the question why automotive software testing was chosen rather than software testing in other application domains. The testing methods presented were debated intensely among the discussants and the attendants.

The last discussion of the workshop was about classification methods for software components based on similarity of use relation (Reishi Yokomori et al.). The attendants raised some issues regarding the similarity of use relation. The similarity of software components can be calculated using the silhouette coefficient. This led to interesting discussions on how similar software components are clustered.

To conclude, in the course of this workshop the participants proposed and discussed different approaches to quantify relevant aspects of software development. The discussions in particular led to new ideas, insights, and take-aways for all participants.

VI. ACKNOWLEDGMENTS

Many people contributed to the success of this workshop. First, we want to thank our invited speaker and the authors and presenters of the accepted papers. Furthermore, we want to express our gratitude to the APSEC 2018 organizers; they did a perfect job. Finally, we are glad that these people served on the program committee (some of them for many years) and supported the workshop by soliciting papers and by writing peer reviews:

• Matthias Vianden, Aspera GmbH, Aachen, Germany
• Wan M.N. Wan Kadir, UTM Johor Bahru, Malaysia
• Maria Spichkova, RMIT University, Melbourne, Australia
• Tachanun Kangwantrakool, ISEM, Thailand
• Jinhua Li, Qingdao University, China
• Apinporn Methawachananont, NECTEC, Thailand
• Nasir Mehmood Minhas, BTH Karlskrona, Sweden
• Chayakorn Piyabunditkul, NSTDA, Thailand
• Sansiri Tanachutiwat, Thai German Graduate School of Engineering, TGGS, Thailand
• Hironori Washizaki, Waseda University, Japan
• Hongyu Zhang, University of Newcastle, Australia
• Minxue Pan, Nanjing University, China

Finally, the QuASoQ organizers would like to express their deepest gratitude to our colleague Ashish Sureka, who passed away early this year, for his continuous support and valuable contributions.

Copyright © 2018 for this paper by its authors.