=Paper= {{Paper |id=Vol-1519/paper2 |storemode=property |title=Automatic Recommendation of Software Design Patterns Using Anti-patterns in the Design Phase: A Case Study on Abstract Factory |pdfUrl=https://ceur-ws.org/Vol-1519/paper2.pdf |volume=Vol-1519 |dblpUrl=https://dblp.org/rec/conf/apsec/NaharS15 }} ==Automatic Recommendation of Software Design Patterns Using Anti-patterns in the Design Phase: A Case Study on Abstract Factory== https://ceur-ws.org/Vol-1519/paper2.pdf
    Automatic Recommendation of Software Design
    Patterns Using Anti-patterns in the Design Phase:
            A Case Study on Abstract Factory

                                                    Nadia Nahar∗ and Kazi Sakib†
                            Institute of Information Technology, University of Dhaka, Dhaka, Bangladesh
                                              ∗ bit0327@iit.du.ac.bd, † sakib@iit.du.ac.bd


     Abstract—Anti-patterns, one of the reasons for software design     of anti-patterns of particular design patterns is conducted in
problems, can be solved by applying proper design patterns. If          the first phase. For capturing the full anti-pattern information
anti-patterns are discovered in the design phase, this should lead      i.e. class structure, interactions, and linguistic relationships, the
to an early pattern recommendation by using relationships between       analysis is performed in three levels - structural, behavioral and
anti- and design patterns. This paper presents an idea called           semantic analysis. In the second phase, the inputted system is
Anti-pattern based Design Pattern Recommender (ADPR), which uses        matched with those anti-patterns for recommending the related
design diagrams, i.e. class and sequence diagrams, to detect            design patterns. This matching is also conducted in three
anti-patterns and recommend corresponding design patterns. First of
all, anti-patterns relating to specific design patterns are analyzed.   levels, similar to the levels of analysis - structural, behavioral
Those anti-patterns are detected in the faulty software design to       and semantic matching. Based on the matched anti-patterns
identify the required design patterns. For assessment, a case study     from these levels, the corresponding ‘missing’ [2] design
is shown along with the experimental result analysis. Initially,        patterns are recommended. ADPR is initially designed for the
ADPR is prepared for recommendation of the Abstract Factory             recommendation of Abstract Factory as it is one of the most
design pattern only, and compared to an existing code-based             popular patterns, and can be extended to the other patterns.
recommender. The comparative results are promising, as ADPR             Research has been conducted for proposing pattern recom-
was successful for all cases of Abstract Factory.                       mendation systems. However, those cannot provide a good
    Keywords—Software design, design pattern, anti-pattern, design      precision due to the difficulty in logically defining the manual
pattern recommendation, abstract factory                                process of mapping human requirements with design pattern
                                                                        intents. The human requirements i.e. usage scenario, designers’
                                                                        answers to questions or cases residing in the knowledge base
                       I.   INTRODUCTION                                in Case Based Reasoning (CBR), have been inadequate to
    Design patterns formalize reusable solutions for common             accurately extract the required design patterns because of the
recurring problems, while anti-patterns are the outcome of bad          lack of focus on the design problems. Generally, these three
solutions degrading the quality of software. Design patterns            approaches of design pattern recommendation can be found
are often described as a double-edged sword: selecting the right        in the literature - textual matching of software usage scenario
pattern can produce good-quality software while selecting a             with design pattern intents [3], [4], [5], question answer session
wrong one (anti-pattern) makes it disastrous [1]. Thus, which           with designers [6], [7], and CBR [8], [9]. The first approach
pattern to use in which situation must be decided wisely.               is ill-suited to identifying probable design problems of software
However, mapping software usage scenarios or user                       as a scenario does not contain design information. The generic
descriptions to pattern intents is a manual and tedious task.           questions of the second approach focus more on design
This task can be made easier with the assistance of                     pattern features than design problems of particular software.
pattern recommendation systems.                                         In the third approach, the cases of CBR do not store possible
The recommendation of a proper design pattern is still an               design problems of software. In contrast, the field of
error-prone process due to the difficulty of connecting software        anti-pattern detection identifies bad designs in software, assuring
information with design pattern intents. The software requirements      that successful detection of anti-patterns is possible [10], [11].
do not indicate possible design problems, making                        However, the use of anti-patterns in the design phase for
it infeasible to identify the required patterns. However, anti-         identifying correct design patterns is yet to be discovered.
patterns can be detected after a faulty design is created from          A case study has been conducted for evaluating the applica-
user requirements. Now, as every design pattern has its own             bility of the proposed approach. The case study is carried on a
context of design problems that it solves and every anti-pattern        badly designed Java project requiring Abstract Factory, named
causes specific design problems, a relationship should exist            Painter. Based on the step-by-step analysis of the project,
between anti- and design patterns that can be beneficial in             Abstract Factory is recommended by the tool. This case study
pattern recommendation.                                                 shows that this recommendation process leads
This paper presents the idea of incorporating anti-pattern detec-       to the correct recommendations.
tion and design pattern recommendation in the software design           The validity of this approach is further justified by experiment-
phase. This idea is encapsulated in a tool named as Anti-               ing ADPR on the case of Abstract Factory design pattern. For
pattern based Design Pattern Recommender (ADPR). The tool               this, the prototype of ADPR was implemented for Abstract
recommends appropriate patterns in two phases. The analysis             Factory using Java. Moreover, an implementation of a prominent


      3rd International Workshop on Quantitative Approaches to Software Quality (QuASoQ 2015)                                      9
research on source based design pattern recommendation,              is a code-level design pattern recommendation approach [12],
proposed by Smith et al. [12], was also performed for the            where patterns are recommended dynamically during the code
comparison. The dataset was created by gathering projects that       development phase. That research tried to relate anti-patterns
require Abstract Factory, in which it has intentionally not been     with design patterns for recommendation. Anti-patterns were
applied. The results are encouraging as ADPR provides better         identified using structural and behavioral matching in the code,
recommendation results in the design phase of software, com-         and required design patterns to mitigate those anti-patterns
pared to the source based one operating in the coding phase.         were recommended. However, design pattern recommendation
                                                                     in the coding phase is too late as the software has already been
                     II.   RELATED WORK                              designed and would need to be changed after the recommendation.
    In terms of recommending suitable patterns for software,         B. Anti-pattern Detection
the relationship establishment between the design pattern and
anti-pattern is rare in the literature. Yet investigations have          Anti-pattern detection is a rich area of research, that
been conducted for proposing design pattern recommendation           focuses on finding bad designs in software [15], [16],
approaches from different perspectives as mentioned below.           [17], [18]. Fourati et al. proposed an anti-pattern detection
On the other hand, anti-pattern detection is a well-established      approach in design level using UML diagrams i.e. the class
research trend for successfully identifying anti-patterns to         and sequence diagrams [10]. The detection was done based
check whether the software design is bad.                            on some predefined threshold values of metrics, identified
                                                                     through structural, behavioral and semantic analysis. This
A. Design Pattern Recommendation                                     prominent research assures that anti-pattern detection can
                                                                     be performed in the design phase. Another approach for
    As mentioned earlier, design pattern recommendation              anti-pattern detection was based on Support Vector Machines
research can be divided into three types – text-based search,        (SVM) [11], where the detection task was accomplished in
question-answer session, and CBR. In text-based search, pat-         three steps - metric specification, SVM classifier training
tern intents are matched with the problem scenarios for iden-        and detection of anti-pattern occurrences. The concept of
tifying the design patterns that relate mostly to the software       anti-pattern training has made any defined or newly defined
[3], [4], [5]. This intent matching is based on a set of important   anti-pattern detection possible, going beyond the
words [3], text classification [4], or query text search using       detection of only some well-established anti-patterns (e.g.
Information Retrieval (IR) techniques [5]. However, problem          Blob, Lava Flow, Poltergeists, etc.) [19].
scenarios are ambiguous, as they are written in human language, and are
usually not written from a designer’s point of view, making it       As presented in subsection II-A, the existing approaches
impractical to identify possible design problems.                    of design pattern recommendation in design phase use textual
In the question-answer based approach, designers are asked to        match with usage scenario, case match with knowledge base
answer some questions about the software and those answers           cases, or ask design pattern related generic questions to
lead to the required patterns for that software [6], [7].            designers. These approaches cannot be the proper ways to
Here, the mapping from question-answers to design patterns           recommend design patterns, as design patterns are used for
is set by formulating a Goal-Question-Metric (GQM) model [6],        mitigating design problems, and these do not focus on the
or ontology-based techniques [7]. The problem is that the            system design problems. The single paper that focuses on
questions are often static or generic, and more related to design    design problems (anti-patterns), recommends design patterns
pattern features than software specific design problems.             in the coding phase, making its usage impractical.
In CBR, recommendations are given according to the previous
experiences of pattern usage stored in a knowledge base in                         III.   THE PROPOSED APPROACH
the form of cases [8], [9]. The retrieval of cases from the
knowledge base is performed either using user provided class             The novelty of this research lies in identifying design
diagrams [8], or using inputted and reformulated problem             problems of software for recommending appropriate design
descriptions [9]. Matching cases to identify required patterns       patterns, and in doing so in the design phase of software. Without having
is not feasible, as the cases do not focus on the design             the analysis of bad designs (i.e. anti-patterns), suggesting
problems a software might have.                                      correct design patterns is difficult. So, an idea is formalized, where
A few studies were conducted for recommending patterns               the appropriate design patterns are suggested by identifying
which do not fall in any of the mentioned categories. Navarro et     existing design problems, that reside as anti-patterns in the
al. proposed a different recommendation system for suggesting        initial system design.
additional patterns to the designer once a collection of patterns
has already been selected [13]. Thus, it may not be used for new     A. Overview of ADPR
software being developed. Kampffmeyer et al. presented a new              Existence of an anti-pattern in a software design discloses
ontology based formalization of the design patterns’ intents         that the design is not appropriate; the design can be improved
making those focus on the problems rather than the solution          by application of suitable design patterns. Thus, the detection
structures [14]. However, the problem predicate and concept          of anti-patterns can lead to the recommendation of design
constraints, required by the recommendation tool, make its           patterns, if the anti-patterns could properly be mapped to their
usage challenging. Both of these approaches require expertise        related design patterns.
of the designers to be used effectively.                             This idea is implemented as a system called Anti-pattern
The research question of this paper is to use anti-pattern           based Design Pattern Recommender (ADPR), which is ini-
knowledge for design pattern recommendation in the design-           tially designed for Abstract Factory design pattern. The top-
phase of software. The most related paper of this research           level overview of ADPR is shown in Fig. 1. There are two


                                                               Fig. 1: Overview of ADPR

phases in the approach. At first the system analyzes the                    ConcreteProductA1 (ConcProdA1), ConcreteProductB1
anti-patterns of particular design patterns. These anti-patterns            (ConcProdB1), and ConcreteProductA2 (ConcProdA2),
need not necessarily be in the anti-patterns catalog like Blob, Lava        ConcreteProductB2 (ConcProdB2). As determined by
Flow, etc.¹ These represent the ‘missing’ design patterns [2]               GoF, instead of being directly instantiated by the Client, these
and their presence indicates that a particular design pattern               families should have been instantiated using abstract factories;
should have been used [20], [2], [12]. As shown in Fig. 1, in               this encourages the usage of the Abstract Factory design pattern²
the second phase, the analyzed anti-patterns are detected in the            in this case. Similarly in Fig. 2(b), ProductA1, ProductB1,
initial system design and the corresponding design patterns to              and ProductA2, ProductB2 are two families of classes,
those matched anti-patterns are recommended. The details of                 which should not be directly instantiated by the Client. Thus,
both these phases are described below.                                      these two class designs represent the anti-patterns of Abstract
                                                                            Factory [2], [21].
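As a hedged illustration of the variant in Fig. 2(a), the following Python sketch shows a client that instantiates both product families directly, next to the Abstract Factory remedy described by GoF. The four product class names follow Fig. 2(a); the `render` method, the `family` parameter and the factory classes are hypothetical names introduced only for this sketch and are not taken from the paper.

```python
from __future__ import annotations

from abc import ABC, abstractmethod


# Two product families as in Fig. 2(a): family 1 = (A1, B1), family 2 = (A2, B2).
# `render` is a hypothetical method, used only to make the sketch observable.
class ConcreteProductA1:
    def render(self) -> str:
        return "A1"


class ConcreteProductB1:
    def render(self) -> str:
        return "B1"


class ConcreteProductA2:
    def render(self) -> str:
        return "A2"


class ConcreteProductB2:
    def render(self) -> str:
        return "B2"


def client_anti_pattern(family: int) -> list[str]:
    """Anti-pattern: the client itself names every concrete class."""
    if family == 1:
        products = [ConcreteProductA1(), ConcreteProductB1()]
    else:
        products = [ConcreteProductA2(), ConcreteProductB2()]
    return [p.render() for p in products]


class AbstractFactory(ABC):
    """Remedy: one factory per family hides the concrete classes."""

    @abstractmethod
    def create_a(self): ...

    @abstractmethod
    def create_b(self): ...


class Factory1(AbstractFactory):
    def create_a(self):
        return ConcreteProductA1()

    def create_b(self):
        return ConcreteProductB1()


class Factory2(AbstractFactory):
    def create_a(self):
        return ConcreteProductA2()

    def create_b(self):
        return ConcreteProductB2()


def client_with_factory(factory: AbstractFactory) -> list[str]:
    """The client depends only on the factory interface."""
    return [factory.create_a().render(), factory.create_b().render()]
```

Both clients produce the same objects; only the second keeps the knowledge of which classes form a family out of the client.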
                                                                            These anti-patterns are analyzed and stored in the tool for
                                                                            further design level matching. Three levels of analysis are
                                                                            performed for ensuring the accurate capture of anti-pattern
                                                                            information - structural, behavioral and semantic (as shown
                                                                            in Fig. 1 ‘Anti-pattern Analysis’ phase), similar to the design
                                                                            pattern analysis in [22].
                                                                            The structural analysis concentrates on the structural
                         (a) As Mentioned in [21]                           characteristics of the anti-patterns. Different anti-patterns can
                                                                            have similar structures, making this level of analysis inadequate.
                                                                            Thus, the behavioral analysis is provided for considering the
                                                                            behaviors of the anti-patterns along with the structure. One
                                                                            more level of validation is provided by the semantic analysis,
                                                                            as there can be cases where both structures and behaviors of
                                                                            different anti-patterns may match. Thus, these three levels of
                                                                            analysis ensure the proper refinement of the tool for accurate
                          (b) As Mentioned in [2]                           detection of anti-patterns.
        Fig. 2: Anti-pattern Variants (Abstract Factory)                        Structural Analysis: The structure of an anti-pattern is
                                                                            defined by the relationships among its classes. Thus,
B. Analysis of Anti-patterns                                                class diagrams are used in this level [23] (as shown in
                                                                            Fig. 1, ‘Anti-pattern Class Diagrams’ are inputted to ‘Extract
    To identify the missing design patterns, the related anti-              Structural Info’), as those capture the different class-to-class
patterns are collected and analyzed first. The case of Abstract             relationships e.g. aggregation, generalization, association, etc.
Factory is presented here as the usage example. Several                     To retain this relationship information, the structures are
anti-pattern variants of Abstract Factory may exist; initially, two of      represented and stored in the form of an n × n matrix of prime
those are used (Fig. 2 [2], [21]) to show whether the proposed              numbers as noted by Dong et al. [22] (for tracking cardinality
system works. In Fig. 2(a), there are two families of classes,
                                                                               2 Abstract Factory intent: “Provide an interface for creating families of
  1 “Anti Patterns Catalog,” http://c2.com/cgi/wiki?AntiPatternsCatalog     related or dependent objects without specifying their concrete classes.” [20]


of the relationships). Hence, this level takes the UML class            The behavioral feature of Abstract Factory is that there are families
information of anti-patterns as input and stores it in the              of classes, and these families are always used together [20].
form of matrices. For this, the class diagrams are converted to         Whenever families of classes are found that are always
a program-readable format, XML, and inputted to the tool.               instantiated in the same execution path, while the classes of
In the case of Abstract Factory, the class XMLs of the collected        different families are instantiated in different execution paths,
anti-pattern variants are provided to the analyzer, which creates       the system is required to use Abstract Factory [20].
and stores the structure matrices for each of the variants as
shown in Fig. 3. The first matrix of Fig. 3 is generated from               Semantic Analysis: Semantic features of a system capture
Fig. 2(a). Here,                                                        the logical relationships between classes (e.g. same types of
                                                                        classes in a system, classes that are always used together,
   •     C, A1, B1, A2 and B2 represent Client,                         etc.). Semantics basically relate the structural and behavioral
         ConcProdA1, ConcProdB1, ConcProdA2                             aspects of the system (information of static structure with
         and ConcProdB2 respectively.                                   dynamic behavior). The semantic features of anti-patterns are
                                                                        also assumed to be the same as corresponding design patterns,
   •     The four association (−A→) relations                           as the logical relations among classes should not be changed,
         Client −A→ ConcProdA1, Client −A→ ConcProdB1,                  no matter how the system is being designed. Thus, similar to
         Client −A→ ConcProdA2, Client −A→ ConcProdB2                   the behavioral analysis, related design patterns of anti-patterns
         in Fig. 2(a) are contained in the matrix using the             are analyzed for capturing semantic information as shown in
         prime number ‘2’³.                                             Fig. 1, ‘Related Design Pattern’ to ‘Analyze Semantic Info’.
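The prime-number encoding of the Fig. 2(a) structure can be sketched in Python as follows. The prime values are those given in the paper's footnote 3; the function names, the use of 1 for "no relation", and the multiplication of primes when several relations share a cell are assumptions made for this sketch, so the actual layout of Fig. 3 may differ.

```python
# Prime value per relation type, as stated in footnote 3 of the paper.
PRIMES = {"association": 2, "generalization": 3, "aggregation": 5}


def build_structure_matrix(classes, relations):
    """Return an n x n matrix; cell [i][j] encodes relations from class i to class j.

    Assumption: a cell starts at 1 (no relation) and is multiplied by the
    prime of every relation found, so several relations can share one cell.
    """
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[1] * len(classes) for _ in classes]
    for source, kind, target in relations:
        matrix[index[source]][index[target]] *= PRIMES[kind]
    return matrix


def has_relation(matrix, i, j, kind):
    """A relation is present when its prime divides the cell value."""
    return matrix[i][j] % PRIMES[kind] == 0


# Fig. 2(a): Client (C) is associated with all four concrete products.
CLASSES_2A = ["C", "A1", "B1", "A2", "B2"]
RELATIONS_2A = [("C", "association", target) for target in ["A1", "B1", "A2", "B2"]]
MATRIX_2A = build_structure_matrix(CLASSES_2A, RELATIONS_2A)
```

Under these assumptions the row for C holds the value 2 in the four product columns, and every other cell stays at 1.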
                                                                        In Abstract Factory, classes of similar types form different
Similarly, the second matrix of Fig. 3 is generated from Fig. 2(b),     families [20]. Therefore, the verification of behaviorally matched
where,                                                                  families is done by checking the types of the classes (identified
                                                                        from static structure) in the families. Super-class information
   •     AbsA, A1, A2, AbsB, B1, B2 and C represent                     is used for this purpose, as classes having the same
         AbstractProductA, ProductA1, ProductA2,                        super-classes are generally of similar types; but there can be cases
         AbstractProductB, ProductB1, ProductB2 and                     like Fig. 2(a), where the design is bad enough to not even
         Client respectively.                                           follow that OO convention. For those cases, similarity in the
                                                                        names of classes can give an indication of similar types.
   •     The four generalization (−G→) relations
         (ProductA1 −G→ AbstractProductA,                               C. Detection and Recommendation
         ProductA2 −G→ AbstractProductA,
         ProductB1 −G→ AbstractProductB,                                    Once the anti-patterns are analyzed based on corresponding
         ProductB2 −G→ AbstractProductB) and two                        design patterns, those could be detected in a faulty system
         association relations (Client −A→ ProductA1,                   design for recommending the patterns. Detection of
         Client −A→ ProductB1) are stored in the matrix                 anti-patterns needs three levels of matching similar to the analysis
         using the prime numbers ‘3’ and ‘2’ respectively³.             - structural, behavioral and semantic matching (as shown in
                                                                        Fig. 1 ‘Detection & Recommendation’ phase). If a system
                                                                        design is matched with an anti-pattern completely (structurally,
                                                                        behaviorally and semantically), only then the corresponding
                                                                        design pattern is recommended.
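The type check described under Semantic Analysis (super-class information first, class names as a fallback) can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, and treating the last capitalized part of a name as the type indicator is a heuristic chosen for this sketch; the paper only states that names are split on capital letters and the parts are matched.

```python
import re


def split_name(name):
    """Split a class name on capital letters, e.g. 'WoodenDoor' -> ['Wooden', 'Door']."""
    return re.findall(r"[A-Z][a-z0-9]*", name)


def same_type(a, b, superclass):
    """Decide whether classes a and b are of the same type.

    If both super-classes are known they decide the answer. Otherwise the
    sketch falls back to names and compares the last name part (an assumed
    heuristic, not the paper's exact rule).
    """
    sup_a, sup_b = superclass.get(a), superclass.get(b)
    if sup_a is not None and sup_b is not None:
        return sup_a == sup_b
    parts_a, parts_b = split_name(a), split_name(b)
    return bool(parts_a) and bool(parts_b) and parts_a[-1] == parts_b[-1]


def type_matrix(classes, superclass):
    """Build a 0/1 matrix in which same-type classes share the value 1."""
    return [[1 if same_type(a, b, superclass) else 0 for b in classes] for a in classes]


# Hypothetical design without super-classes, so only the names are available.
DOOR_CLASSES = ["WoodenDoor", "GlassDoor", "WoodenWindow", "GlassWindow"]
DOOR_TYPES = type_matrix(DOOR_CLASSES, {})
```

With these example names the two door classes are marked as one type and the two window classes as another, which is the kind of information the matching step needs to confirm that different families contain classes of similar types.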
                                                                            Structural Matching: The system structure is represented
                                                                        in the same way as the matrices of anti-patterns, using the system
                                                                        class diagram. The stored anti-patterns’ structures (Fig. 3)
                                                                        are matched to the system’s structure for finding whether
                                                                        any of those anti-patterns is present in the system (Fig. 1,
                                                                        from ‘System Class Diagram’ to ‘Extract and Match Structural
              Fig. 3: Generated Matrices of Fig. 2                      Info’). For this, the system matrix is matched with the
                                                                        anti-patterns’ matrices using a naive approach, as the focus is on
                                                                        accuracy rather than computational complexity or time. In this
    Behavioral Analysis: Behaviors of a system represent its            approach, matrices are matched using a brute force method
dynamic characteristics (e.g. class execution sequence at               where every permutation of the system matrix (permutation
run-time). Now, it is logical to assume that the behaviors              of nodes in the system graph) is taken and matched with
of a design pattern are inherited by its anti-patterns, as              the anti-pattern matrices. If no match is found, the detection
the anti-patterns provide bad software structures compared to           is stopped and the other levels of matching are skipped.
that pattern, but preserve the software behaviors. Thus, in             Otherwise, for at least one structural match, the behavioral
behavioral analysis, the behaviors of the corresponding design          matching is executed.
patterns of anti-patterns are analyzed (Fig. 1, ‘Related Design
Pattern’ leads to ‘Analyze Behavioral Info’).                               Behavioral Matching: Sequence diagrams are used in this
                                                                        level as those represent the dynamic interactions of classes in
 3 The determined prime number value of Association is 2,               execution [23] (Fig. 1, ‘System Sequence Diagrams’ are
Generalization is 3, and Aggregation is 5, similar to [12].             inputted to ‘Extract and Match Behavioral Info’). The lifelines


of a sequence diagram are the roles or object instances⁴, and                    Algorithm 1 Semantic Matching
represent the classes in the same execution sequence. Thus,                       1: system: System Matrix
families of classes in Abstract Factory are identified from these                 2: cN: System Class Names
lifelines, as classes of the same family are supposed to be in                    3: behavioralMetric: Behaviors of Anti-pattern (Sequence
the same execution sequence, and so in the same sequence                               Diagram for Abstract Factory)
diagram lifelines. For this, the UML sequence diagrams of                         4: procedure MatchSemantic
the system are first converted to XML and inputted to the                         5:    seqs ← behavioralMetric.sequenceDiagrams
tool. Then, the XML files are parsed to identify the lifelines and                6:    size ← seqs.size()
their corresponding classes. Thus, the                                            7:    seq ← [size][size]
identified classes of each sequence diagram are marked to be                      8:    type[cN.length][cN.length] ← GenTypeMatrix()
in the same family.                                                               9:    for i ← 0 to size do
                                                                                 10:        for j ← i + 1 to size do
    Semantic Matching: Whether a particular design pattern                       11:            CompareSeq(seqs.get(i), seqs.get(j), i, j)
should be recommended is decided in the semantic matching step. In               12:        end for
semantic matching for Abstract Factory, types of the classes are                 13:    end for
analyzed to validate the family information acquired from the                    14:    maxMatch ← 0
behavioral matching as per the findings of semantic analysis                     15:    for i ← 0 to size do
(different classes of similar types form different families). A                  16:        for j ← 0 to size do
matrix containing the type similarity information of classes is                  17:            if maxMatch < seq[i][j] then
generated using the super-class relations. However, as                           18:                maxMatch ← seq[i][j]
mentioned earlier, sometimes the class types cannot be identified                19:            end if
due to missing super-classes in a bad design (Fig. 2(a)). For                    20:        end for
those cases, similarity in the names of the classes is analyzed                  21:    end for
to identify the same types (as shown in Fig. 1, ‘System Class                    22:    return maxMatch
Types Or Naming’ are used to ‘Extract and Match Semantic                         23: end procedure
Info’). The class names are split based on capital letters,                      24: procedure CompareSeq(s1, s2, p1, p2)
and the parts are matched (for example, ‘WoodenDoor’ is                          25:    RemoveDuplicates(s1, s2)
split to ‘Wooden’, ‘Door’, and ‘GlassDoor’ is split to ‘Glass’,                  26:    for i ← 0 to s1.size() do
‘Door’, and matched to each other). After the class types are                    27:        for j ← 0 to s2.size() do
determined, the mentioned type matrix is generated. Then, that                   28:            s ← −1, d ← −1
matrix is used to analyze the classes in multiple families to                    29:            for k ← 0 to cN.length do
test whether those are aligned to the assumption of Abstract                     30:                if s1.get(i) = cN.get(k) then
Factory that multiple families contain different classes of                      31:                    s ← k
similar types.                                                                   32:                end if
Now, if the design is so bad that it has neither super-classes nor               33:                if s2.get(j) = cN.get(k) then
similar names for the same types of classes, the approach will                   34:                    d ← k
fail to generate the type matrix and so to match semantics. Thus, for            35:                end if
getting a recommendation, the basic design principles should be                  36:                if s ≠ −1 and d ≠ −1 then
followed by the designers. The semantic matching algorithm                       37:                    break
is shown in Algorithm 1.                                                         38:                end if
For semantic matching, first of all the type matrix is generated                 39:            end for
(Algorithm 1 Line 8). As mentioned previously, it can be                         40:            if s ≠ −1 and d ≠ −1 then
generated from super-class information (generalization                           41:                seq[p1][p2] ← seq[p1][p2] + type[s][d]
relationship) or similar naming of classes. The type matrix is a                 42:                seq[p2][p1] ← seq[p2][p1] + type[s][d]
0,1 matrix, where classes of the same type share value 1, and the                43:            end if
others share value 0. Then, all sequences (class families)                       44:        end for
are compared to each other (Lines 9–13). The procedure                           45:    end for
C OMPARE S EQ is called for this reason. In C OMPARE S EQ,                       46: end procedure
the duplicates in the sequences being compared are removed
in Line 25. Then nested loops are executed for getting the
positions of the classes of the sequences in the type matrix
using the class names list (cN ) (Line 26–39). The value in                      IV.     C ASE S TUDY ON “PAINTER ”, A P ROJECT R EQUIRING
those positions inside the type matrix (0 or 1) is added to the                                      A BSTRACT FACTORY
seq matrix in Lines 41–42. After the calculation of the values
in all the seq positions, maxM atch between the sequences                            For an initial assessment of the competency, ADPR was
are identified in Lines 14–21. This maxM atch is returned as                     used on a sample java project named P ainter (Shown
the score of semantic matching. If the score value is >= 2,                      in Table I). This step-by-step study might increase the
there is a valid semantic match.                                                 understanding of the tool as well as justify the feasibility of
                                                                                 the approach.
   4 R. Perera, “The Basics & the Purpose of Sequence Diagrams -
Part 1,” http://creately.com/blog/diagrams/the-basics-the-purpose-of-sequence-   It is assumed here that, the analysis of anti-patterns
diagrams-part-1/                                                                 have already been performed. And thus, the tool has stored


       3rd International Workshop on Quantitative Approaches to Software Quality (QuASoQ 2015)                                       13
the required anti-patterns’ information for the purpose of
detecting those and recommending the corresponding design
patterns for the inputted systems.
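
As a reading aid for the case study, the semantic matching of Algorithm 1 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the ADPR implementation: the function names are hypothetical, the type of a class without a known super-class is taken from the last CamelCase part of its name, the type matrix diagonal is 0, and REMOVEDUPLICATES is read as dropping classes that occur in both sequences. The classes WoodenWindow and GlassWindow are invented for illustration.

```python
import re

def split_name(name):
    """Split a class name at capital letters: 'WoodenDoor' -> ['Wooden', 'Door']."""
    return re.findall(r"[A-Z][a-z0-9]*", name)

def build_type_matrix(class_names, superclass=None):
    """Build the 0/1 type matrix: 1 where two different classes share a type.

    The type comes from the super-class map when available; otherwise
    (assumption) from the last CamelCase part of the class name.
    """
    superclass = superclass or {}

    def type_of(cls):
        if cls in superclass:
            return superclass[cls]
        parts = split_name(cls)
        return parts[-1] if parts else cls

    n = len(class_names)
    return [[1 if i != j and type_of(class_names[i]) == type_of(class_names[j]) else 0
             for j in range(n)] for i in range(n)]

def compare_seq(s1, s2, class_names, type_matrix):
    """Sum the type-matrix values over all class pairs of two families."""
    common = set(s1) & set(s2)  # assumption: shared classes count as duplicates
    a = [c for c in dict.fromkeys(s1) if c not in common]
    b = [c for c in dict.fromkeys(s2) if c not in common]
    index = {c: k for k, c in enumerate(class_names)}
    score = 0
    for x in a:
        for y in b:
            if x in index and y in index:
                score += type_matrix[index[x]][index[y]]
    return score

def semantic_match_score(families, class_names, type_matrix):
    """Return the largest pairwise score, playing the role of maxMatch."""
    best = 0
    for i in range(len(families)):
        for j in range(i + 1, len(families)):
            best = max(best, compare_seq(families[i], families[j],
                                         class_names, type_matrix))
    return best

class_names = ["WoodenDoor", "GlassDoor", "WoodenWindow", "GlassWindow"]
families = [["WoodenDoor", "WoodenWindow"], ["GlassDoor", "GlassWindow"]]
type_matrix = build_type_matrix(class_names)
score = semantic_match_score(families, class_names, type_matrix)
is_valid_match = score >= 2  # threshold stated in the text
```

Because the seq matrix of Algorithm 1 is symmetric, taking the maximum over unordered family pairs gives the same score as scanning the whole matrix.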


A. About Painter
    The project Painter is a well-known example of Abstract Factory usage⁵.
To test the recommendation tool, the project is designed without
implementing Abstract Factory (that is, badly designed). The scenario of
the project is as follows:
“The Paint can draw three types of Shape - Circle, Triangle, or Square. The
Shapes can be filled with three Colors - Red, Blue, or Green. Circles will
be Red, Triangles will be Blue, and Squares will be Green.”

B. Structural Matching of Painter
    As mentioned in ‘Structural Matching’ in subsection III-C, the system
structure is to be matched with the anti-patterns’ structure. For this, the
initial class diagram of Painter, shown in Fig. 4, is inputted into the tool
in XML format. This XML is converted into a matrix of prime numbers that
preserves the relationships between the classes (as instructed in [22]), as
shown in Fig. 5. There are six association
($Paint \xrightarrow{A} Blue$, $Paint \xrightarrow{A} Green$,
$Paint \xrightarrow{A} Red$, $Paint \xrightarrow{A} Square$,
$Paint \xrightarrow{A} Triangle$, $Paint \xrightarrow{A} Circle$)
and six generalization
($Blue \xrightarrow{G} IColor$, $Green \xrightarrow{G} IColor$,
$Red \xrightarrow{G} IColor$, $Square \xrightarrow{G} IShape$,
$Triangle \xrightarrow{G} IShape$, $Circle \xrightarrow{G} IShape$)
relationships in the diagram. These are fully preserved by putting the value
‘2’ in places of association and ‘3’ in places of generalization³.

                 Fig. 4: Class Diagram of Painter

                 Fig. 5: Class Relation Matrix of Painter

The anti-patterns’ structures are assumed to be stored in the tool. The
structures of those stored anti-patterns are matched with the Painter matrix
using naive matrix matching. From Fig. 4 and Fig. 2 (a), a match is
encountered. Thus, the structural matching is accomplished, and the tool
proceeds to the next level of matching.

   5 “Design Pattern - Abstract Factory Pattern,”
http://www.tutorialspoint.com/design_pattern/abstract_factory_pattern.htm

C. Behavioral Matching of Painter
    For behavioral matching, information about the interactions between
classes during execution is required. This information is extracted from the
sequence diagrams. From the scenario of Painter, three sequence diagrams can
be drawn (Fig. 6).

      (a) Circle Is Red     (b) Triangle Is Blue     (c) Square Is Green
                 Fig. 6: Sequence Diagrams of Painter

The class families are identified from the lifelines of these sequence
diagrams. As three sequence diagrams are inputted, three families are
identified. The first family consists of Paint, Circle, and Red; the second
family has the classes Paint, Triangle, and Blue; and the third family
comprises Paint, Square, and Green.

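The prime-number encoding used in the structural matching of Painter can be illustrated with a short Python sketch. The values 2 (association) and 3 (generalization) and the twelve relationships come from the text; the function name, the row-to-column orientation (source class in the row, target class in the column), the class ordering, and the use of 1 for "no relationship" are assumptions made for this illustration and may differ from Fig. 5.

```python
ASSOCIATION = 2      # prime used for association, as stated in the text
GENERALIZATION = 3   # prime used for generalization, as stated in the text

# Class order is an arbitrary choice for this sketch.
classes = ["Paint", "IShape", "Circle", "Triangle", "Square",
           "IColor", "Red", "Blue", "Green"]

associations = [("Paint", target) for target in
                ("Blue", "Green", "Red", "Square", "Triangle", "Circle")]
generalizations = ([(child, "IColor") for child in ("Blue", "Green", "Red")] +
                   [(child, "IShape") for child in ("Square", "Triangle", "Circle")])

def relation_matrix(classes, associations, generalizations):
    """Encode class relationships as primes in a square matrix.

    Cells start at 1 (assumption: no relationship); each relationship
    multiplies the cell by its prime, so several relationships between the
    same pair of classes remain recoverable by factorization.
    """
    index = {name: k for k, name in enumerate(classes)}
    n = len(classes)
    matrix = [[1] * n for _ in range(n)]
    for source, target in associations:
        matrix[index[source]][index[target]] *= ASSOCIATION
    for source, target in generalizations:
        matrix[index[source]][index[target]] *= GENERALIZATION
    return matrix

painter_matrix = relation_matrix(classes, associations, generalizations)
```

With this encoding, the cell for Paint and Blue holds 2 and the cell for Circle and IShape holds 3, so a stored anti-pattern matrix could be compared cell by cell.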

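The identification of class families from sequence diagram lifelines can be sketched as below. This is only an illustration: each diagram is represented as a list of (sender, receiver) messages, and the concrete messages are assumed, since only the participating classes of each diagram are given in the text. The function name is hypothetical.

```python
def family_from_lifelines(messages):
    """Collect the lifelines of one sequence diagram in order of first use."""
    lifelines = []
    for sender, receiver in messages:
        for lifeline in (sender, receiver):
            if lifeline not in lifelines:
                lifelines.append(lifeline)
    return lifelines

# Assumed message flow: Paint calls the shape, then the color.
sequence_diagrams = {
    "Circle Is Red": [("Paint", "Circle"), ("Paint", "Red")],
    "Triangle Is Blue": [("Paint", "Triangle"), ("Paint", "Blue")],
    "Square Is Green": [("Paint", "Square"), ("Paint", "Green")],
}

# One family per sequence diagram.
families = [family_from_lifelines(messages)
            for messages in sequence_diagrams.values()]
```

Each resulting family lists Paint together with one shape and one color, which is the input that the semantic matching step validates.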
D. Semantic Matching of Painter
    The three families identified in the behavioral matching are validated
at this level. First of all, the type matrix (as mentioned in subsection
III-C ‘Semantic Matching’) is generated using the super-class information
from the class relation matrix (Fig. 5). The type matrix is shown in Fig. 7.
Situations can occur in which the super-class information is missing. For
example, another badly designed variation of the class diagram can be
created by the designer, as shown in Fig. 8. Note that, although the
super-classes are missing, the type matrix can still be generated from the
similarity in the names of classes of the same type: RedColor, BlueColor,
and GreenColor, and likewise CircleShape, TriangleShape, and SquareShape,
are identified as being of the same types. However, if the names of
same-typed classes are not similar in this case, the approach will fail to
generate the type matrix. For example, if the classes are named as in
Fig. 4 but the super-classes IShape and IColor are missing, the approach
will fail.

               Fig. 7: Type Matrix of Painter

  Fig. 8: Another Bad Class Diagram Example of Painter

After the type matrix is generated, the class families are analyzed to test
whether different classes having the same types are situated in different
families. The three identified families are analyzed here, and it is found
that all three families contain classes of the same types. Circle
(family-1), Triangle (family-2) and Square (family-3) are of the same type,
and similarly Red (family-1), Blue (family-2) and Green (family-3) are also
of the same type. So, the semantic matching confirms that the families
identified by the behavioral matching are valid families.

All three levels of matching indicate that the Abstract Factory design
pattern is required to improve the project design. Thus, Abstract Factory is
recommended for this project. This recommendation is obtained in the design
phase of the project, making it possible to re-design it and provide a
better design of the system.

      V. IMPLEMENTATION AND RESULT ANALYSIS: FOR ABSTRACT FACTORY

    To assess the new approach, preliminary experiments have been conducted
on the Abstract Factory design pattern. A prototype of ADPR has been
implemented in Java for this purpose. The existing anti-pattern based
pattern recommendation tool that uses source code [12] is also implemented
for comparative analysis. To judge whether a recommendation is correct, GoF
[20] is followed.

A. Environmental Setup
    As mentioned earlier, the ADPR prototype has been implemented in Java.
The tools used to develop the prototype are as follows:
    •   Eclipse Luna (4.4.1): Java IDE for ADPR implementation
    •   StarUML Version-2.1.4: UML editor and XML converter
Four cases requiring Abstract Factory according to GoF have been used as
the dataset. To test for any occurrence of false positives, one project
using the Template pattern is used. The project source codes and UML
diagrams are uploaded on GitHub [24]. The projects are shown in Table I.

                  TABLE I: Experimented Projects

    Project Name    No. of Classes in Class Diagram    No. of Sequence Diagrams
    CarDriver       8                                  2
    GameScene       10                                 2
    Painter         9                                  3
    MazeGame        12                                 2
    Trip            9                                  3

Before running ADPR on the sample project set, the XMLs are generated from
the UMLs using StarUML, to be used as input to the prototype. If the UMLs
are not available, they can be produced from source code by reverse
engineering in Visual Paradigm, a software design tool.

B. Comparative Analysis
    For comparative analysis, the projects were run using both ADPR and the
source-based tool. The results of the experimentation are depicted in
Table II, which shows that the code-based tool could detect two missing
Abstract Factory patterns out of four. This is because it assumes that
Abstract Factory has a behavioral aspect of having if-else or switch-case
conditions for instantiating the families, which may not always be true
(for example, class instantiations inside a GUI onclick listener). On the
other hand, ADPR was successful in all cases, as the sequence diagrams do
not assume the presence of any conditional operations but rather match the
classes in one execution sequence. Neither tool produced any false-positive
results.
The result shows that recommendations can be provided based on
anti-patterns before the code development phase. Recommendation in the
design phase gives opportunity


to correct the design of software, which is not feasible in the coding
phase. Thus, the results of ADPR are encouraging, as it could provide
correct recommendations in the design phase, making the re-design of
software possible.

              TABLE II: Results for Abstract Factory

                        Recommend Abstract Factory
    Project Name    Code-Based    ADPR    GoF
    CarDriver       Yes           Yes     Yes
    GameScene       No            Yes     Yes
    Painter         Yes           Yes     Yes
    MazeGame        No            Yes     Yes
    Trip            No            No      No

                          VI. CONCLUSION

    This paper introduces a new idea to recommend design patterns using
anti-patterns. A tool named ADPR is proposed, in which anti-pattern
detection is utilized for recommending appropriate design patterns in the
software design phase.
The recommendation task is executed in two phases: analysis of
anti-patterns is performed in the first phase, and in the next phase,
anti-patterns are detected and design patterns are recommended. For
anti-pattern analysis in the first phase, anti-patterns of particular design
patterns are collected and analyzed at three levels - structural,
behavioral, and semantic. Then, in the second phase, the identified
anti-patterns are matched with system designs to recommend the corresponding
design patterns, using the same three levels of matching.
A case study on a sample Java project evaluates the applicability of the
approach. The tool was initially implemented for Abstract Factory only. A
comparative analysis with an existing code-based tool showed that ADPR could
correctly recommend design patterns in the design phase rather than in the
coding phase.
As the tool is currently developed for Abstract Factory only, the future
direction lies in extending it incrementally to other design patterns and
generalizing the process.

                              REFERENCES

 [1]   N. Bautista, “A Beginners Guide to Design Patterns,”
       http://code.tutsplus.com/articles/a-beginners-guide-to-design-patterns--net-12752,
       accessed: 2015-01-01.
 [2]   C. Jebelean, “Automatic Detection of Missing Abstract-Factory Design
       Pattern in Object-Oriented Code,” in Proceedings of the International
       Conference on Technical Informatics, 2004.
 [3]   Y.-G. Guéhéneuc and R. Mustapha, “A Simple Recommender System for
       Design Patterns,” in Proceedings of the 1st EuroPLoP Focus Group on
       Pattern Repositories, 2007.
 [4]   S. M. H. Hasheminejad and S. Jalili, “Design Patterns Selection: An
       Automatic Two-phase Method,” Journal of Systems and Software,
       Elsevier, vol. 85, no. 2, pp. 408–424, 2012.
 [5]   S. Suresh, M. Naidu, S. A. Kiran, and P. Tathawade, “Design Pattern
       Recommendation System: a Methodology, Data Model and Algorithms,” in
       Proceedings of the International Conference on Computational
       Techniques and Artificial Intelligence (ICCTAI), 2011.
 [6]   F. Palma, H. Farzin, Y.-G. Guéhéneuc, and N. Moha, “Recommendation
       System for Design Patterns in Software Development: An DPR Overview,”
       in Proceedings of the 3rd International Workshop on Recommendation
       Systems for Software Engineering. IEEE, 2012, pp. 1–5.
 [7]   L. Pavlič, V. Podgorelec, and M. Heričko, “A Question-based Design
       Pattern Advisement Approach,” Computer Science and Information
       Systems, vol. 11, no. 2, pp. 645–664, 2014.
 [8]   P. Gomes, F. C. Pereira, P. Paiva, N. Seco, P. Carreiro, J. L.
       Ferreira, and C. Bento, “Using CBR for Automation of Software Design
       Patterns,” Advances in Case-Based Reasoning, Springer Berlin
       Heidelberg, vol. 2416, pp. 534–548, 2002.
 [9]   W. Muangon and S. Intakosum, “Case-based Reasoning for Design
       Patterns Searching System,” International Journal of Computer
       Applications, vol. 70, no. 26, pp. 16–24, 2013.
[10]   R. Fourati, N. Bouassida, and H. B. Abdallah, “A Metric-Based
       Approach for Anti-pattern Detection in UML Designs,” Studies in
       Computational Intelligence, Springer Berlin Heidelberg, vol. 364,
       pp. 17–33, 2011.
[11]   A. Maiga, N. Ali, N. Bhattacharya, A. Sabané, Y.-G. Guéhéneuc,
       G. Antoniol, and E. Aïmeur, “Support Vector Machines for Anti-pattern
       Detection,” in Proceedings of the 27th IEEE/ACM International
       Conference on Automated Software Engineering (ASE), 2012,
       pp. 278–281.
[12]   S. Smith and D. R. Plante, “Dynamically Recommending Design
       Patterns,” in Proceedings of the 24th International Conference on
       Software Engineering and Knowledge Engineering (SEKE), 2012,
       pp. 499–504.
[13]   I. Navarro, P. Díaz, and A. Malizia, “A Recommendation System to
       Support Design Patterns Selection,” in Proceedings of the IEEE
       Symposium on Visual Languages and Human-Centric Computing (VL/HCC).
       IEEE, 2010, pp. 269–270.
[14]   H. Kampffmeyer and S. Zschaler, “Finding the Pattern You Need: The
       Design Pattern Intent Ontology,” Model Driven Engineering Languages
       and Systems, Springer Berlin Heidelberg, vol. 4735, pp. 211–225,
       2007.
[15]   N. Moha, Y.-G. Gueheneuc, L. Duchien, and A.-F. Le Meur, “Decor: A
       Method for the Specification and Detection of Code and Design
       Smells,” IEEE Transactions on Software Engineering, IEEE, vol. 36,
       no. 1, pp. 20–36, 2010.
[16]   T. Feng, J. Zhang, H. Wang, and X. Wang, “Software Design Improvement
       through Anti-patterns Identification,” in Proceedings of the 20th
       IEEE International Conference on Software Maintenance. IEEE, 2004,
       p. 524.
[17]   A. Maiga, N. Ali, N. Bhattacharya, A. Sabane, Y.-G. Guéhéneuc, and
       E. Aimeur, “SMURF: A SVM-based Incremental Anti-pattern Detection
       Approach,” in Proceedings of the 19th Working Conference on Reverse
       Engineering (WCRE). IEEE, 2012, pp. 466–475.
[18]   V. Cortellessa, A. Di Marco, R. Eramo, A. Pierantonio, and
       C. Trubiani, “Digging into UML Models to Remove Performance
       Antipatterns,” in Proceedings of the 32nd ICSE Workshop on
       Quantitative Stochastic Models in the Verification and Design of
       Software Systems. ACM, 2010, pp. 9–16.
[19]   W. J. Brown, H. W. McCormick, T. J. Mowbray, and R. C. Malveau,
       AntiPatterns: Refactoring Software, Architectures, and Projects in
       Crisis. Wiley New York, 1998.
[20]   E. Gamma, R. Helm, R. Johnson, and J. Vlissides, Design Patterns:
       Elements of Reusable Object-Oriented Software. Pearson Education,
       1994.
[21]   A. Jarvi, “Abstract Factory: 2005,”
       http://staff.cs.utu.fi/kurssit/Programming-III/AbstractFactory(10).pdf,
       accessed: 2015-01-03.
[22]   J. Dong, D. S. Lad, and Y. Zhao, “DP-Miner: Design Pattern Discovery
       Using Matrix,” in Proceedings of the 14th Annual IEEE International
       Conference and Workshops on Engineering of Computer-Based Systems
       (ECBS). IEEE, 2007, pp. 371–380.
[23]   H. Zhu and I. Bayley, “An Algebra of Design Patterns,” ACM
       Transactions on Software Engineering and Methodology (TOSEM), ACM,
       vol. 22, no. 3, p. 23, 2013.
[24]   N. Nahar, “NadiaIT/ADPR-dataset: 2015,”
       https://github.com/NadiaIT/ADPR-dataset, accessed: 2015-06-05.


