Mining Extension Point Patterns in Scala

Yunior Pacheco (1, 2), Jonas De Bleser (1), Tim Molderez (1), Dario Di Nucci (1), Wolfgang De Meuter (1), Coen De Roover (1)
(1) Vrije Universiteit Brussel, Brussels, Belgium
(2) Pinar del Rio University, Pinar del Rio, Cuba
{ypacheco, jdeblese, tmoldere, ddinucci, wdmeuter, cderoove}@vub.be

Abstract: To use a framework, developers often need to hook into and customise some of its functionality. These customisations are often made by instantiating a type provided by the framework, or by extending or implementing a framework type and instantiating this subtype, and providing the resulting object to other framework objects. Recommending extension patterns that frequently occur at such extension points can help developers to adopt a new framework correctly and to exploit it fully. In this paper we transpose an existing technique for mining extension patterns in Java projects to the Scala context. Our goal is to evaluate whether the unique features of the Scala language have an impact on the mined extension patterns. To this aim, we propose SCALA-XP-MINER, a tool for mining extension point patterns in Scala projects. We conduct a preliminary evaluation of SCALA-XP-MINER on a corpus of 9 projects using the SWING framework. Our first results reveal that extension points are not widespread in Scala projects using SWING and that only one type of extension point is adopted by developers.

Index Terms: Framework, Extension Points, Usage Patterns, Graph Mining, Scala Language, Mining Software Repository

I. INTRODUCTION

A significant part of software development consists of becoming familiar with APIs from different libraries and frameworks. Libraries and frameworks enable code reuse, provide high-level abstractions for common tasks, and help unify the programming experience [1]. In many cases the flexibility offered by large libraries and frameworks is achieved at the expense of sophisticated APIs that must be accessed by combining API elements into usage patterns, while taking into account constraints and specialised knowledge about the behaviour of the API [2]. Thus, making efficient use of libraries and frameworks and exploiting all the possibilities they offer can be quite difficult. The larger and more sophisticated the library or framework, the harder this challenge, due to specific requirements and relations between its components that are difficult to understand. Furthermore, the library or framework may not be documented completely or clearly [1].

When using a framework, it is either necessary or common to extend its functionality. Several studies analysed how developers use libraries in software systems, providing tools to explore and navigate usage examples [3]–[5], documentation techniques [6], and recommendations of usage patterns obtained from mining code examples [7], [8]. However, there are relatively few studies that focus on common ways to extend a framework and on providing recommendations to developers in this respect.

In this paper, we describe the transposition from Java to Scala of a technique by Asaduzzaman et al. [9] for mining framework extension point patterns. Such patterns frequently occur at the points where a framework supports being extended by its users, such as a public method that takes an instance of a framework subtype defined by the users. Once mined from a sufficient number of projects, extension point patterns can form the basis for recommendations to other projects that use the framework. To this end, we present SCALA-XP-MINER, a tool for mining extension point patterns in Scala projects. SCALA-XP-MINER implements a transposition to Scala of the technique introduced by Asaduzzaman et al. [9].

We motivate the transposition not only by the traction that Scala is gaining in industry, but also by the unique features of this programming language that may impact the patterns that occur at framework extension points. We therefore use SCALA-XP-MINER in a preliminary study with the aim of analysing the diffusion and the characteristics of the extension points in Scala projects. In particular, we analysed 9 projects that use the SCALA-SWING framework, a Scala wrapper for the well-known SWING GUI framework. Our first results reveal that extension points are not widespread in our context and that only one type of extension point is adopted by developers.

Structure of the paper. The remainder of this paper is organised as follows: Section II describes the work of Asaduzzaman et al. [9], including the definition of extension points and extension point patterns. Section III presents an overview of the approach, while Section IV discusses the results on an initial dataset of Scala projects. Finally, limitations and challenges are discussed in Section V.

II. MINING EXTENSION POINT PATTERNS

EXTENSION POINTS have been defined by Asaduzzaman et al. as means provided by a framework that allow developers to customise its behaviour to meet application-specific requirements. A common way to extend a framework is to pass one of its objects as an argument to a framework call [9]. Such an argument may be created by subclassing a framework class, implementing a framework interface, or customising the properties of an existing framework object. There are other ways of extending a framework, but we only consider the one proposed by Asaduzzaman et al.

Thus, an extension point comprises a framework method of which the parameters are related to the framework itself. An EXTENSION POINT USAGE is a call to an extension point method, of which at least one of the arguments is either an instance of a type provided by the framework, or an instance of a user-defined type that extends a framework type.

Figure 1(a) depicts an example of extension point usage in Java. The developer is adding a listener to a framework object. The addActionListener method, declared in the Button class, defines a parameter of type ActionListener. In this case, the framework is extended by i) implementing the ActionListener interface in the MyActionListener class and ii) calling the addActionListener method with an instance of the class as argument. The behaviour of the ActionListener was customised by overriding the method actionPerformed.

(a) Java:

    import java.awt.Button;
    import java.awt.event.ActionEvent;
    import java.awt.event.ActionListener;
    import javax.swing.JFrame;

    // Client class that implements a framework interface.
    class MyActionListener implements ActionListener {

      @Override
      public void actionPerformed(ActionEvent e) {
        System.out.println("Tickles!");
      }

    }

    class MyFrame extends JFrame {

      private MyFrame() {
        Button convertButton = new Button("Convert");
        // Extension point usage: the argument is an instance of the client class.
        convertButton.addActionListener(new MyActionListener());
      }
    }

(b) Scala:

    import scala.swing.{Button, MainFrame}
    import scala.swing.event.ButtonClicked

    class MyFrame extends MainFrame {

      val convertButton = new Button {
        text = "Convert"
      }

      // Extension point usage: listenTo receives a framework object.
      this.listenTo(convertButton)

      reactions += {
        case ButtonClicked(_) => println("Tickles!")
      }

    }

Fig. 1: Example extension point usage of the SWING framework in Java (a) and the SCALA-SWING framework in Scala (b).

Developers can analyse existing projects to find examples of extension point usage, which can be a time-consuming task. Thus, Asaduzzaman et al. [9] propose an approach to automatically locate extension points and mine extension point usage. The approach represents each extension point usage as a separate graph.

An EXTENSION POINT USAGE GRAPH consists of several types of nodes: receiver type, method call, parameter type, argument type, other method calls, extended class, implemented interface, overriding method. To build the extension point usage graphs, it is necessary to parse and analyse the source code of the project that uses the framework. Asaduzzaman et al. use ECLIPSE JDT to this end. The extension point usage graph shown in Figure 2 illustrates the graph representation of the Java code in Figure 1(a).

    Button --method_call--> addActionListener() --parameter--> ActionListener --argument--> MyActionListener
    MyActionListener --Implements--> ActionListener
    MyActionListener --override--> actionPerformed

Fig. 2: Example of an EXTENSION POINT USAGE GRAPH representing the code in Figure 1(a)

The subgraphs that occur most frequently in the extension point usage graphs are called EXTENSION POINT PATTERNS. These patterns are useful to describe how an extension point is commonly used. In summary, the approach defined by Asaduzzaman et al. [9] generates an extension point usage graph for each extension point usage. The graphs are processed, using a frequent subgraph mining algorithm, to extract the extension patterns. The patterns are then grouped in categories according to a taxonomy defined by the authors that describes the complexity of the pattern:

(i) SIMPLE: an instance of a framework class is passed as an argument to the extension point without modifying it;
(ii) CUSTOMISE: before passing the argument of a framework type to the extension point, a number of state changing methods are called on it;
(iii) EXTEND: the argument to the extension point is an instance of a new class that extends a framework class;
(iv) IMPLEMENT: the argument to the extension point is an instance of a new class that implements a framework interface.

Finally, for each category, the patterns that most frequently occur in the codebase are shown.

Figure 3 shows an extension point pattern common to, among others, the extension point usage graph in Figure 2 and thus the code in Figure 1(a). In particular, it shows how the method addActionListener is used to extend the behaviour of the Button class by i) creating an instance of a client class that overrides the method actionPerformed, and ii) passing it as the argument to the method call. It is worth noting that the extension point pattern needs to occur more frequently in the mined extension point usage graphs than a threshold to which the mining algorithm has been configured.

    Button --method_call--> addActionListener() --parameter--> ActionListener --argument--> Client --override--> actionPerformed

Fig. 3: EXTENSION POINT PATTERN extracted from Figure 2

III. OVERVIEW OF THE FRAMEWORK

Our framework for mining EXTENSION POINT PATTERNS in Scala projects consists of three components: a source code importer, a pattern miner, and a visualisation tool. Figure 4 depicts the interactions between these components.

Fig. 4: Overview of the approach. Components: (1) Source Code Importer, (2) Extension Point Patterns Miner, (3) Visualisation Tool. Artefacts: Source Code, Extension Point Graphs, Extension Point Patterns.

We follow a 3-step approach similar to the one proposed in the reference work [9]. First, we build a graph for each of the extension point usages in the Scala projects that depend on the framework under analysis. Next, we mine extension point patterns using the information previously extracted. Finally, we visualise the input and output of the mining algorithm in a way that allows the developer to browse and understand the results.

Source Code Importer. The importer takes the source code of framework clients as input and collects information on framework usages through a static analysis.

For each framework method call, the importer checks whether it corresponds to an extension point. To be considered, the method call must have at least one parameter that is related to a framework type. For each extension point, we collect the method name, the return type, and the types of the parameters. To construct the extension point usage graphs, the importer resolves the types of the receiver and the arguments to the extension point and determines their type hierarchy and the overridden methods in this type hierarchy. It also identifies method calls of which the receiver is either the same receiver as that of the extension point or refers to one of its arguments. The importer extracts the required syntactic and semantic information through the SCALA-META library (https://github.com/scalameta/scalameta).

Figure 1(b) shows how to add a listener to a framework object in Scala using the SCALA-SWING framework. In this example, the extension point is the listenTo method, defined in the framework trait Reactor, implemented by the MainFrame framework class. This method takes a variable number of objects of type Publisher as its arguments. In this case the argument is an instance of class Button, a class of the framework. The object was passed to the method call without making changes to its state, that is, without calling methods on the Button object before the call to the extension point. Figure 5 shows the extension point usage graph generated for the Scala code example in Figure 1(b).

    Reactor --method_call--> listenTo() --parameter--> Publisher --argument--> Button

Fig. 5: Example of an EXTENSION POINT USAGE GRAPH for the code in Figure 1(b)

Extension Point Patterns Miner. The miner is responsible for mining the extension point patterns. The frequent subgraph mining algorithm used in the miner takes as input the set of extension point usage graphs generated by the importer in the previous step. The frequent subgraph mining algorithm is a variant of the Apriori algorithm [10]. We mined projects containing code similar to the one shown in Figure 1(b). We first obtained their representation as extension point usage graphs as in Figure 5. Afterwards, we used the mining algorithm to obtain the pattern shown in Figure 6. Note that the code in Figure 1(b) correctly matches this pattern.

    A: Reactor --method_call--> listenTo --parameter--> Publisher --argument--> Button
    B: Reactor --method_call--> listenTo --parameter--> Publisher --argument--> TextField

Fig. 6: EXTENSION POINT PATTERN extracted from Figure 5

Visualisation Component. The visualisation component is used to configure the tool, and to browse through and inspect its results. The developer can select the Scala projects to import through the SOURCE CODE IMPORTER. Then, the EXTENSION POINT PATTERNS MINER analyses the projects to discover extension point usages of a specific framework (given as input). After the computation, the tool displays a list-view and a graph-view of the extension point usage graphs that were built by the importer and that constitute the input to the mining algorithm. Finally, the tool supports inspecting the frequent extension point patterns uncovered by the mining algorithm, and comparing them to their matches among the input data, to corroborate the validity of the mining process.

IV. PRELIMINARY EVALUATION

This section presents the design and the results of the preliminary study that we conducted to analyse extension point patterns in Scala projects.

A. Design

We follow the process defined by Wohlin et al. [11] to conduct this study. The goal of our study is to analyse (i) to what extent extension points occur and (ii) to which category of those defined by Asaduzzaman et al. [9] they belong. The purpose of this study is to collect extension points of the SCALA-SWING library in order to recommend frequently used patterns to developers. Based on this study, we aim to answer the following research questions:

RQ1: To what extent do extension points occur in SCALA-SWING projects?
RQ2: What kind of extension patterns occur and which are the most frequently used for the SCALA-SWING library?

To this end, we collected a dataset of 9 open-source SCALA-SWING projects hosted on GitHub. The characteristics of each project are reported in Table I. Even if the corpus is relatively small, a manual inspection of the extension points is too time-consuming and error-prone. For that reason, we used the framework described in Section III.

TABLE I: Dataset Characteristics

    Project                                                             # Classes   # LOC
    https://github.com/MarcinCz/MilkaRecognizer                         23          751
    https://github.com/enshahar/ScalarTurtle                            13          549
    https://github.com/Sciss/ScalaInterpreterPane                       24          1473
    https://github.com/myrjola/scaltris                                 10          455
    https://github.com/gabysbrain/scala-swing-jogl-demo                 3           266
    https://github.com/amsterdam-scala/AS-Tiles-puzzle-solver           6           614
    https://github.com/Sciss/ScalaColliderSwing                         30          3980
    https://github.com/junxiaosong/AlphaZeroGomoku                      19          1143
    https://github.com/scala/scala-swing/tree/2.0.x/examples/src/main   17          279

We determined the precision of SCALA-XP-MINER by manually inspecting every extension point graph. We found that 16 out of 440 nodes were generated incorrectly. We believe there are multiple reasons for this, but the most common one is that we cannot resolve the type of expressions that use type parameters (e.g., we are not able to resolve Int as the result type of the method List[+A]#head: A where List[+A] was instantiated to List[Int]). The precision of our tool is therefore 96% (424 of 440 nodes), which we deem sufficiently precise to use this tool as a foundation for our study.

B. Results

We were able to collect 92 extension point usages across 9 projects using SCALA-XP-MINER. These extension point usages are represented by extension point usage graphs as defined by [9]. On average, there are 10 extension point usages per project. We found that the method Reactor#listenTo and the method BorderPanel#add are the most frequently used extension points, in 40% and 17% of the cases respectively.

To answer RQ2, we applied the mining algorithm on the set of 92 extension point usages and obtained 34 patterns which range in size from 2 to 6. Next, we categorised each extension pattern according to the taxonomy as defined by [9]. We found that all of the mined extension patterns are instances of the category SIMPLE.

The top-5 most-occurring extension patterns are shown in Table II. For example, the third pattern (Figure 6) indicates that it is common to extend the framework at the Reactor class using its listenTo method, passing an instance of Button as argument. We use - to indicate that there is no explicit information about the argument type (i.e., patterns #1 and #4).

TABLE II: Top 5 most frequent Extension Patterns

    #   FrameworkClass.Method   Parameters    Arguments     Frequency
    1   BorderPanel.add()       Constraints   Enumeration   16/92 (17.30%)
                                Component     -
    2   Reactor.listenTo()      Publisher     Publisher     8/92 (8.60%)
    3   Reactor.listenTo()      Publisher     Button        7/92 (7.60%)
    4   GridBagPanel.add()      Constraints   Constraints   6/92 (6.50%)
                                Component     -
    5   GridBagPanel.add()      Constraints   Constraints   5/92 (5.40%)
                                Component     Label

V. CURRENT LIMITATIONS AND CHALLENGES

In this paper, we have shown how data mining can be used to discover extension point patterns in projects that use the SCALA-SWING framework. These patterns provide useful information to developers that want to extend the behaviour of a framework. We used the approach and the concepts of extension points defined by Asaduzzaman et al. [9] as foundation for this work. We applied this approach to the Scala programming language to find extension patterns of the SCALA-SWING framework.

Our initial results show that, for the SCALA-SWING framework, the use of extension points to modify its behaviour is not very widespread. On average, there are only 10 extension point usages per project, and the scala.swing.Reactor#listenTo and scala.swing.BorderPanel#add methods are the most frequently occurring ones. Moreover, we found that all extension patterns are of the category SIMPLE.

Our future work includes (i) a better evaluation technique to assess precision and recall of our approach, (ii) a large empirical study on a corpus of Scala projects, (iii) improved type resolution for complex expressions in the importer, (iv) the definition of Scala-specific extension point categories, and (v) the pruning of uninteresting and redundant patterns from the results of the miner.

REFERENCES

[1] M. P. Robillard, “What makes APIs hard to learn? Answers from developers,” IEEE Software, vol. 26, no. 6, pp. 27–34, 2009.
[2] M. P. Robillard, E. Bodden, D. Kawrykow, M. Mezini, and T. Ratchford, “Automated API property inference techniques,” IEEE Transactions on Software Engineering, vol. 39, no. 5, pp. 613–637, 2013.
[3] C. De Roover, R. Lämmel, and E. Pek, “Multi-dimensional exploration of API usage,” in Proceedings of the 21st IEEE International Conference on Program Comprehension (ICPC13), 2013.
[4] D. Mandelin, L. Xu, R. Bodík, and D. Kimelman, “Jungloid mining: Helping to navigate the API jungle,” in Proceedings of the ACM SIGPLAN 2005 Conference on Programming Language Design and Implementation (PLDI 2005), 2005.
[5] S. Thummalapenta and T. Xie, “Parseweb: a programmer assistant for reusing open source code on the web,” in Proceedings of the twenty-second IEEE/ACM international conference on Automated software engineering. ACM, 2007, pp. 204–213.
[6] R. Alur, P. Černý, P. Madhusudan, and W. Nam, “Synthesis of interface specifications for Java classes,” ACM SIGPLAN Notices, vol. 40, no. 1, pp. 98–109, 2005.
[7] H. Zhong, T. Xie, L. Zhang, J. Pei, and H. Mei, “MAPO: Mining and recommending API usage patterns,” in Proceedings of the 23rd European Conference on Object-Oriented Programming (ECOOP09), 2009, pp. 318–343.
[8] T. T. Nguyen, H. A. Nguyen, N. H. Pham, J. M. Al-Kofahi, and T. N. Nguyen, “Graph-based mining of multiple object usage patterns,” in Proceedings of the 7th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering. ACM, 2009, pp. 383–392.
[9] M. Asaduzzaman, C. K. Roy, K. A. Schneider, and D. Hou, “Recommending framework extension examples,” in Software Maintenance and Evolution (ICSME), 2017 IEEE International Conference on. IEEE, 2017, pp. 456–466.
[10] C. C. Aggarwal, Data mining: the textbook. Springer, 2015.
[11] C. Wohlin, M. Höst, and K. Henningsson, “Empirical research methods in software engineering,” in Empirical methods and studies in software engineering. Springer, 2003, pp. 7–23.
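
As an illustration only, the taxonomy of Section II (SIMPLE, CUSTOMISE, EXTEND, IMPLEMENT) and the notion of a frequency threshold can be sketched in a few lines of Scala. This is not the implementation of SCALA-XP-MINER: every name below (Usage, ArgumentOrigin, classify, frequentPatterns, minSupport) is hypothetical, the usages are hand-written rather than extracted from source code, and counting identical receiver/method/parameter/argument chains is a deliberate simplification that stands in for the Apriori-based frequent subgraph mining described in Section III.

```scala
object ExtensionPointSketch {

  // The four categories of the taxonomy summarised in Section II.
  sealed trait Category
  case object Simple extends Category
  case object Customise extends Category
  case object Extend extends Category
  case object Implement extends Category

  // Hypothetical encoding of how the argument relates to the framework.
  sealed trait ArgumentOrigin
  case object FrameworkInstance extends ArgumentOrigin    // instance of a framework type
  case object ClientSubclass extends ArgumentOrigin       // client class extending a framework class
  case object ClientImplementation extends ArgumentOrigin // client class implementing a framework interface

  // One extension point usage, reduced to the nodes of a linear usage graph.
  final case class Usage(
      receiverType: String,
      method: String,
      parameterType: String,
      argumentType: String,
      origin: ArgumentOrigin,
      callsOnArgumentBefore: List[String] = Nil // state-changing calls made before the usage
  )

  // Assigns a usage to a category of the taxonomy.
  def classify(u: Usage): Category = u.origin match {
    case ClientSubclass                                        => Extend
    case ClientImplementation                                  => Implement
    case FrameworkInstance if u.callsOnArgumentBefore.nonEmpty => Customise
    case FrameworkInstance                                     => Simple
  }

  // A linear pattern: receiver, method call, parameter type, argument type.
  final case class Pattern(
      receiverType: String,
      method: String,
      parameterType: String,
      argumentType: String
  )

  // Simplification (assumption): counts identical linear patterns and keeps
  // those that occur at least minSupport times, most frequent first.
  def frequentPatterns(usages: Seq[Usage], minSupport: Int): Seq[(Pattern, Int)] =
    usages
      .groupBy(u => Pattern(u.receiverType, u.method, u.parameterType, u.argumentType))
      .toSeq
      .map { case (pattern, matches) => (pattern, matches.size) }
      .filter { case (_, count) => count >= minSupport }
      .sortBy { case (pattern, count) => (-count, pattern.toString) }

  // Hand-written usages loosely modelled on Figures 1(a) and 1(b).
  val example: Seq[Usage] = Seq(
    Usage("Reactor", "listenTo", "Publisher", "Button", FrameworkInstance),
    Usage("Reactor", "listenTo", "Publisher", "Button", FrameworkInstance),
    Usage("Reactor", "listenTo", "Publisher", "TextField", FrameworkInstance),
    Usage("Button", "addActionListener", "ActionListener", "MyActionListener", ClientImplementation)
  )
}
```

With the hand-written usages above, calling ExtensionPointSketch.frequentPatterns(ExtensionPointSketch.example, 2) keeps only the Reactor/listenTo/Publisher/Button chain, which occurs twice, and classify assigns it to the SIMPLE category. A real miner would additionally have to handle branching graphs such as the one in Figure 2, which this linear encoding cannot represent.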