=Paper= {{Paper |id=Vol-2689/paper2 |storemode=property |title=Comparison of Occurrence of Design Smells in Desktop and Mobile Applications |pdfUrl=https://ceur-ws.org/Vol-2689/paper2.pdf |volume=Vol-2689 |authors=Daniel Ogenrwot,Joyce Nakatumba-Nabende,Michel R.V. Chaudron |dblpUrl=https://dblp.org/rec/conf/seia-ws/OgenrwotNC20 }} ==Comparison of Occurrence of Design Smells in Desktop and Mobile Applications== https://ceur-ws.org/Vol-2689/paper2.pdf
      Comparison of Occurrence of Design Smells in
           Desktop and Mobile Applications
          Daniel Ogenrwot                    Joyce Nakatumba-Nabende                                 Michel R.V. Chaudron
 Department of Computer Science Department of Computer Science Department of Computer Science and Engineering
         Gulu University              Makerere University             Chalmers | Gothenburg University
         Gulu, Uganda                  Kampala, Uganda                       Gothenburg, Sweden
      d.ogenrwot@gu.ac.ug          jnakatumba@cis.mak.ac.ug                 chaudron@chalmers.se



    Abstract—Design smells are symptoms of poor solutions to recurring design problems in a software system. Those symptoms have a direct negative impact on software quality by making the software difficult to comprehend and maintain. In this paper we compare the occurrence of design smells between different technological ecosystems: windows/desktop and android/mobile. This knowledge is significant for various software maintenance activities such as program quality assurance and refactoring. To supplement previous findings, our study aimed at (a) understanding if and how the relationship among design smells differs across windows and mobile applications and (b) determining the groups of design smells that tend to occur frequently together and the magnitude of their occurrence in windows and mobile applications. In this study, we explored the use of statistics and unsupervised learning on a dataset consisting of twelve (12) Java-based open-source projects mined from GitHub. We identified fifteen (15) most frequent design smells across desktop and mobile applications. Additionally, a clustering technique revealed which groups of design smells often co-occur. Specifically, {SpeculativeGenerality, SwissArmyKnife} and {LongParameterList, ClassDataShouldBePrivate} are observed to occur frequently together in desktop and mobile applications.

    Index Terms—Design Smells, Clustering, Software Quality, Anti-patterns

                        I. INTRODUCTION

   The concept of “design smell” was introduced by Fowler [1] and defined as symptoms of poor solutions to recurring design problems. The symptoms of design smells in a software system are also referred to as anti-patterns. Design smells are normally introduced in source code by developers during their daily activities such as the implementation of user requirements, developing important patches or during a “hack” or “workaround” to obtain a sub-optimal solution to existing problems [2]. According to previous studies, the existence of design smells makes programs complex to comprehend, summarize and modify [3], which poses a direct negative effect on software quality [3], [4]. The existence of code smells in any code-base calls for refactoring, which is a technique of restructuring a program without changing its external behavior to ensure that any further development is possible. However, the cost of refactoring becomes expensive in terms of time and resources, especially for today’s ever-evolving software platforms.

   The need for effective detection of anti-patterns has attracted a lot of research interest, both from academia and the software industry. As such, significant studies towards the realization of effective design smell detection methods and tools have been conducted over the last few years [3], [5], [6]. Software metrics provide the backbone for most anti-pattern detection platforms and approaches. They are applied to evaluate the internal code quality and productivity, as well as maintainability of software. For example, Imran [3] used an unsupervised spectral clustering tool guided by software metrics to detect design smells across 3,306 classes of real-life open-source Java software.

   Design smells are detected using a combination of software metrics such as Depth of Inheritance Tree (DIT), Weighted Methods per Class (WMC) and/or Coupling Between Objects (CBO) among others. Understanding the diversity, distribution, magnitude and co-occurrence of various design smells within source code could provide a good opportunity for optimizing these metrics and consequently improving rule-based design smell detection approaches. Moreover, this knowledge is essential to developers in implementing various design smell control and prevention mechanisms as well as guiding software maintenance activities.

   It is upon this motivation that we propose a method based on statistics and machine learning to understand the diversity, distribution, magnitude and co-occurrence of design smells across desktop and mobile applications. We identified and analyzed fifteen (15) most frequent design smells across twelve (12) open-source object-oriented Java projects extracted from GitHub. This paper makes the following contributions:
   1) We present a comparison of occurrence of design smells in desktop and mobile applications with a focus on diversity, distribution, magnitude and co-occurrence using a combination of heuristics, statistics and unsupervised learning techniques.
   2) We show that the aforementioned clustering approach can be leveraged to expose hidden and non-innate relationships among design smells.
   3) We discuss the implications of our study for researchers, the software industry and the development of design smell detection tools.

   The rest of the paper is structured as follows: In section II we explain key concepts of design smells, their effect on software maintainability and discuss related work. Next, in section III, we provide a detailed explanation of the dataset and methods used to conduct our study. Then, we present the results and discussions in section IV. We end by discussing implications in V, conclusions in VII and directions for future work.

  This work was supported by Mak-Sida Project 381.

Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0)

                     II. RELATED WORK

A. Design Smells
   Design smells, also known as “anti-patterns” [7], “code smells” [8] or “bad smells”, are indicators of issues in source code that can negatively affect maintainability of a software system as well as various programming activities [9]. Technically, code smells do not stop the system from functioning but can easily affect the development process, weaken the sustainability of software and increase the probability of its failure [10]. The existence of design smells in source code calls for code refactoring, which is a common programming task aimed at improving the internal structure of existing software code without affecting its observable behavior. It greatly enhances software maintainability and generates a more manageable internal architecture [10]. An earlier study by Fowler [1] outlines 22 design smells and their corresponding refactoring techniques. Those design smells are programming language-independent but mostly target the object-oriented paradigm.

B. Design Smell Detection Techniques
   The detection of harmful code smells which deteriorate the software quality has attracted a lot of research interest, both from academia and the software industry. As such, significant studies towards effective detection of code smells have been conducted over the last few years. Sharma and Spinellis [11] grouped design smell detection strategies into five broad categories: Metrics-based, Rules/Heuristic-based, History-based, Machine learning-based and Optimization-based smell detection.

   1) Metric-based design smell detection: Metrics-based detection is the most common approach to design smell detection [12]. It is relatively easy to implement and normally follows three generic steps: (1) take source code as input and prepare a source code model such as an Abstract Syntax Tree (AST); (2) compute a set of source code metrics that capture the characteristics of smells; (3) detect smells by applying a suitable threshold value.

   2) Machine learning-based design smell detection: Several authors have applied machine learning in the detection of code/design smells with a common focus on supervised learning [6], [13]–[15] and cluster-based approaches [3].

   Liu et al. [14] proposed a deep learning-based approach to detect Feature Envy. Their approach relies on both structural and lexical information. Labeled samples were automatically generated from open-source applications. The gold dataset is fed into two convolutional neural network layers and one fully-connected layer to perform classification. More recently, Barbez et al. [6] extended the approaches in [13] and [14] using an ensemble machine learning method called SMart Aggregation of Anti-patterns Detectors (SMAD) to detect anti-patterns. SMAD was designed by integrating several detection tools based on their internal detection rules to produce an improved prediction from a reasonable number of training examples. SMAD significantly outperformed other ensemble methods, especially for the detection of two well-known anti-patterns, i.e. God Class and Feature Envy, in eight Java projects.

   Although machine learning-based smell detection has demonstrated a lot of potential as recorded by different authors, it is heavily dependent on a large amount of training data, and the availability of such datasets is still a huge concern.

   3) Optimization: Most studies in this category utilize optimization algorithms such as genetic algorithms to detect anti-patterns in a software system. Saranya et al. [5] applied a genetic algorithm for model-level code smell detection. The motivation of their study was based on the limitation of rule and metrics-based code smell detection approaches. Specifically, defining the rules and identifying the correct threshold value in rule-based code smell detection is a tedious task and is normally achieved through trial and error. To address this issue, their work introduced a Euclidean distance-based Genetic Algorithm and Particle Swarm Optimization (EGAPSO). EGAPSO proved to be effective when compared to other code smell detection approaches such as DEtection & CORrection (DECOR) [16].

C. Impact of Design Smells on Software Maintenance
   It is important to note that code smells are not errors and therefore, they do not prevent the software from functioning. In some circumstances, code smells are introduced by developers while implementing important patches or developing a “hack” or “workaround” as a sub-optimal solution to an existing problem [2]. According to previous studies, the existence of design smells makes the program complex to comprehend, summarize and modify [3], [17], which negatively affects software quality [4].

   Soh et al. [9] studied the effects of code smells at the developer’s activity level, i.e. code reading, editing, searching, and navigating. The experiment involved six expert developers performing maintenance tasks on four Java applications. Each developer performed two tasks while logs were monitored. An annotation schema was then defined to identify developers’ activities and assess whether code smells affect their different maintenance tasks. The results of their study indicated that code smells indeed affect the effort of certain kinds of activities but the effect is contingent on the type of maintenance.

   It is also noted that design smells are associated with the occurrence of software bugs, which affect maintenance tasks. Cairo et al. [18] carried out a systematic literature review on published studies that provide evidence of the influence of code smells on the occurrence of software bugs. In their study, 24 code smells were found to be more influential in the occurrence of bugs. Specifically, God Class, Shotgun Surgery and God Method were found to be significant contributors and positively associated with error proneness.
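
   The three generic steps of metric-based detection outlined in Section II-B (build a source model, compute metrics, apply thresholds) can be illustrated with a minimal sketch. It is not the method of any tool cited above: it parses Python source with the standard `ast` module as a stand-in for a Java AST, the two metrics are simple proxies, and the thresholds `MAX_METHODS` and `MAX_PARAMS`, the helper names and the sample classes are all hypothetical.

```python
import ast

# Hypothetical thresholds, chosen only for illustration.
MAX_METHODS = 3
MAX_PARAMS = 4


def class_metrics(source):
    """Steps 1-2: parse the source into an AST and compute per-class metrics."""
    tree = ast.parse(source)
    metrics = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef):
            methods = [n for n in node.body if isinstance(n, ast.FunctionDef)]
            # Positional parameters per method, minus one for "self"
            # (assumes plain instance methods).
            max_params = max((len(m.args.args) - 1 for m in methods), default=0)
            metrics[node.name] = {"methods": len(methods), "max_params": max_params}
    return metrics


def detect_smells(metrics):
    """Step 3: report a smell whenever a metric exceeds its threshold."""
    smells = {}
    for name, m in metrics.items():
        found = []
        if m["methods"] > MAX_METHODS:
            found.append("LargeClass")
        if m["max_params"] > MAX_PARAMS:
            found.append("LongParameterList")
        smells[name] = found
    return smells


SAMPLE = '''
class Small:
    def run(self, a):
        return a

class Wide:
    def build(self, a, b, c, d, e):
        return a
'''

if __name__ == "__main__":
    print(detect_smells(class_metrics(SAMPLE)))
```

   Changing either threshold changes which classes are flagged, which is the calibration difficulty that motivates the optimization-based approaches discussed above.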
D. Design Smells in Desktop and Mobile Applications                  • All projects rely on the Java Swing library for GUI
   A few previous papers carried out work closely related to the       design.
                                                                     • Cross-platform compatibility (i.e. can function on both
study of design smells in desktop and mobile applications. To
begin with, Mannan et al. [19] conducted a study to understand         Windows, MacOS and Linux operation system) and de-
code smells in android applications and how they compare               pended on Java Core libraries for the design of its back-
with desktop application. Their study involved a large corpus          end logic.
of desktop and android application collected from Github.          The selected projects constitute a total of 1,601,369 Java Lines
   Habchi et al. [20] studied code smells in iOS by analyz-        of Code (LoC). We choose Java-based projects because they
ing 279 open-source iOS apps. In this paper, the authors           account for a wide variety of open source projects hosted on
considered the presence of object-oriented and iOS-specific        different code repositories and at the time of this research, Java
code smells by analyzing 279 open-source iOS apps. Source          was among the top 3 popular programming languages within
code was analyzed by extending PAPRIKA toolkit in order to         the software industry. Moreover, Java is used by billions of
accommodate the detection of code smells in Objective-C or         devices across the globe. The details of the selected projects
Swift language. Their observation shows that iOS apps tend         are shown in Table I.
to contain the same proportions of code smells regardless of
                                                                   TABLE I: Projects selected in this study including total LoC
the development language, but they seem to be less prone to
                                                                   and number of Design Smells (#DS) in each codebase.
code smells compared to Android apps.
   Another interesting replication study by Palomba et al.           No.   Domain    Project            Version       #LoC     #DS
[21] focused on investigating code-smells co-occurrence using        1     Desktop   SweetHome3d        5.6         104,059     206
                                                                     2     Desktop   Mars Simulation    3.1.0       255,459     875
association rule. This study was carried out on a dataset            3     Desktop   ArgoUML            0.35.1      177,372   1,160
composed of 395 releases of 30 software systems, capturing           4     Desktop   JEdit              5.5.0       124,164     605
13 code-smells. Their results highlighted some expected re-          5     Desktop   GanttProject       2.9.11       66,709     394
                                                                     6     Mobile    K9 Mail            5.600        93,540     247
lationships but also revealed some co-occurrences missed by          7     Mobile    Bitcoin Wallet     6.31         18,079      50
previous research.                                                   8     Mobile    KeepassDroid       2.5.9        17,916     156
   Despite the results obtained from these study, the following      9     Mobile    Opentrip Planner   2.1.5         9,760      28
                                                                     10    Mobile    Telegram           6.1.1       541,694     540
extensions can be made; The study by Mannan et al. [19]              11    Mobile    Tweet Lanes        1.4.1        25,886     105
focused mainly on code smells which tends to affect read-            12    Mobile    Text Secure        4.69.5      166,731     799
ability and simplicity. The task of refactoring in this case               Total                                  1,601,369      —
involves renaming or extracting to methods. Design smells,
on the other hand, tend to be more subtle. They usually affect
maintainability and flexibility. Next, their study [19] did not    B. Data Preprocessing
look at the general co-occurrence of code smells and how these
                                                                      For the preprocessing task, we passed the project class
co-occurrences varies across desktop and android application.
                                                                   files as input to an anti-pattern detection tool. Particularly,
Palomba et al. [21] focused on general co-occurrence of code
                                                                   we used Pattern Trace Identification, Detection, and Enhance-
smells in Java but not in mobile applications. Moreover, that
                                                                   ment in Java (Ptidej) tool. It is a open source Java-based
projects selected in this study comprise a mixer of Java desk-
                                                                   reverse engineering tool suite that includes several identifi-
top applications and libraries, whose internal implementation
                                                                   cation algorithms for idioms, micro-patterns, design patterns,
slightly differs.
                                                                   and design defects [22]. Using this tool, we were able to
                                                                   detect and select fifteen (15) frequent anti-patterns across the
                             III. M ETHOD
                                                                   selected projects which includes: LongMethod, ComplexClass,
   In this section we discuss the approach taken to conduct        LongParameterList, BaseClassShouldBeAbstract, Speculative-
this study including data collection, preprocessing and analysis   Generality, ClassDataShouldBePrivate, ManyFieldAttributes-
techniques.                                                        ButNotComplex, MessageChain, SpaghettiCode, RefusedPar-
                                                                   entBequest, SwissArmyKnife, Blob, AntiSingleton, LargeClass,
A. The Dataset                                                     LazyClass. Design smells are detected and stored in “.ini"
   Our dataset is based on twelve (12) real-life open-source       files. The file names are tagged with a specific design smell
Java projects mined from GitHub. Seven (7) of the projects         type. For example, in the k-9 mail project, AntiSingleton
are mobile (android-based) applications and the other five (5)     design smell is stored as “DetectionResults in K9 for An-
are desktop applications. The android projects were selected       tiSingleton.ini". Our goal is to extract class names and the
from a list of projects previously studied by Mannan et al. [19]   corresponding design smell detected in that class.
found here1 . For our study, we focused on the latest releases        We apply heuristics to determine the structure and pattern
from GitHub. The selection criteria for desktop applications       of class names in the detected anti-pattern result files for the
was based on two significant characteristics:                      different projects. We use python regular expressions to extract
                                                                   class names and associated them with respective design smell
  1 http://web.engr.oregonstate.edu/ mannanu/AndroidProjects.txt   type. A value of 1 was assigned if a particular anti-pattern is
detected in a class otherwise 0 is assigned. Table II shows a      A. RQ: Does the type of application i.e. desktop or mobile
sample of the final output of the preprocessing tasks.             influence the variations in diversity, distribution and magni-
C. Data Analysis                                                   tude of design smells occurrence? If so, are these variations
                                                                   statistically significant?
   We start by grouping the data to create a collection of
aggregated number of each design smells in specific project.          To answer this research question, we start by discussing
Next, we grouped the data according to whether the code-           some of the key differences and similarities between mobile
base is a mobile (android) app or desktop software as shown        and desktop applications. According to the dataset, both mo-
in Table III.                                                      bile and desktop projects are based on Java programming
D. Clustering                                                      language and fundamentally obey the OOP programming
                                                                   paradigm. However, some key notable differences exist, for
   Clustering is one of the most important concepts for un-        example, desktop applications mostly rely on the Java Swing
supervised learning in machine learning. In this study, we         library for Graphical User Interface (GUI) design while mobile
used Powered Outer Probabilistic Clustering (POPC) [23].           (android) applications are based on XML as their underlying
The choice of this algorithm was based on the following            language for the design of GUI [19]. Table V presents some of
motivation: First, numerous clustering algorithms including        these key notable differences. We believe that these differences
the popular k-means algorithm, require the number of clusters      could influence the diversity, distribution and magnitude of
to be specified in advance which is a huge drawback. Some          design smell occurrence.
studies use the silhouette coefficient, elbow method, and
                                                                      1) Diversity of Design Smells: We found some interesting
other approaches to determine the optimal number of clusters.
                                                                   variations in the variety of design smells that occur in desktop
However, those methods have their limitations, for example:
                                                                   and mobile applications. Generally, we observed that desktop
sometimes the elbow method fails to give a clear “elbow
                                                                   applications have a diverse type of design smells compared
point”. Second, k-means is not very suitable for a binary
                                                                   to mobile applications. For example, looking at Table III,
feature sets.
                                                                   we can observe that desktop applications account for up
   Using POPC, we do not need to specify the number of
                                                                   to 93% of the total type of design smells detected in the
clusters upfront. It tries to mitigate these drawbacks using
                                                                   entire data set whereas mobile applications takes up about
back-propagation. The algorithm is observed to work quite
                                                                   73%. These variations are caused by the following design
well on binary datasets and converges to the expected (optimal)
                                                                   smells: RefusedParentBequest, SpaghettiCode, MessageChains
number of clusters on theoretical examples as elaborated by
                                                                   and SwissArmyKnife that occurred in desktop applications
Taraba [23].
   Based on the processed dataset in Table II, we constructed      only while LargeClass was observed in only the mobile
two different datasets for the task of clustering. The first       applications.
dataset consist of desktop data while the second contain mobile       We think these variations can be explained based on the dif-
data. This was carried out to determine if there were any          ferences in the workflow of android and desktop applications
observable differences in the cluster formation across the         (as highlighted in Table V). Based on these differences, we
datasets. The output of POPC clustering was used to group          expect more types of design smells to occur in desktop than
design smells based on their occurrence in each set of data as     in mobile applications. For example, android applications are
shown in Figure 4, using the following procedure:                  built upon the android frameworks which encapsulates low-
   1) First, we constructed a table of clusters and design         level functionalities of the Android OS. Thus, the developer
      smells for both desktop and android- Table IV.               does not have to implement several classes at UI and activity
                                                                   management level, thereby reducing the possibility of inducing
        • For each cluster (c) of classes, we compute the total
                                                                   design smells
           number of each design smell.
        • If the total number of a given design smells is > 0,
                                                                      2) Distribution and Magnitude of Design Smells: The sec-
           we assign a value 1 in its row, otherwise, we assign    ond part of this research question focus on understanding
           a value 0. The output of this operation is shown in     the distribution and quantity of design smells in desktop
           Table IV.                                               and mobile applications. Figure 1 shows both the distribution
        • We repeat this process for all the clusters.
                                                                   and magnitude of design smells across desktop and mobile
   2) Secondly, we extract the design smell rows and create a      applications. We noticed that design smells are almost always
      binary matrix. This matrix is treated as an n-dimensional    more in desktop applications than in mobile applications. This
      array which is then passed to a dendrogram creation          difference can be observed in Figure 2 where desktop and
      function. We use the python Plotly package which             mobile applications account for 67.5% and 32.5% of the total
      performs hierarchical clustering on data and represents      number of design smells in our corpus respectively. A rather
      the resulting tree.                                          large variation exists in the quantity of Blob and Refused-
                                                                   ParentBequest in desktop and mobile applications. Next, we
                         IV. R ESULTS                              explored the relationship between lines of code (LoC) and
   In this section, we present and discussion the results of our   magnitude of design smells in desktop and mobile application.
study. We answer the following research questions:                 Figure 3 shows a scatter plot of total LoC against magnitude
                                         TABLE II: Sample output of processed design smell files.
 No.     FullClassPath                                                         LongMethod    LazyClass   Blob   ComplexClass   ...
 1       k9mail.src.main.java.com.fsck.k9.controller.SimpleMessagingListener       1             1        1          0         ...
 2       org.thoughtcrime.securesms.mms.AudioSlide                                 1             0        0          0         ...
 3       org.telegram.ui.Components.GroupCreateSpan                                1             0        0          1         ...
 4       org.telegram.ui.Cells.TextSelectionHelper                                 1             0        0          1         ...
 5       com.keepassdroid.database.PwDate                                          1             0        0          1         ...
 6       com.tweetlanes.android.core.view.HomeActivity                             1             0        0          1         ...


TABLE III: Diversity of design smells across desktop and
mobile applications.
   No.     Design Smells                          Desktop     Mobile
   1       LongMethod                             3           3
   2       ComplexClass                           3           3
   3       LongParameterList                      3           3
   4       LazyClass                              3           3
   5       Blob                                   3           3
   6       ClassDataShouldBePrivate               3           3
   7       RefusedParentBequest                   3           7
                                                                               Fig. 2: Aggregated percentages of design smells across desktop
   8       AntiSingleton                          3           3                and mobile applications.
   9       BaseClassShouldBeAbstract              3           3
   10      SpeculativeGenerality                  3           3
   11      SpaghettiCode                          3           7
   12      ManyFieldAttributesButNotComplex       3           3
   13      MessageChains                          3           7
   14      SwissArmyKnife                         3           7
   15      LargeClass                             7           3



for each selected project within the two ecosystems. We found
that there is an observable linear correlation between LoC and
magnitude of design smell. Particularly, design smells tend to
increase proportionally to the increase in project size, except
for a few cases. We think this result is interesting and worth
further investigation especially using various project releases.

Fig. 3: Comparison of the lines of code with magnitude of
design smells.
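
The linear relationship between LoC and the magnitude of design smells described above can be quantified with a Pearson correlation coefficient. The following is a minimal sketch, not the computation used in the study: the project names and the (LoC, smell count) pairs are hypothetical placeholders, and `pearson_r` is an illustrative helper.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squares around the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical (lines of code, detected design smells) per project.
# These values are placeholders and are NOT taken from the paper.
projects = {
    "desktop_app_a": (12_000, 140),
    "desktop_app_b": (48_000, 510),
    "mobile_app_a": (9_500, 95),
    "mobile_app_b": (73_000, 690),
    "mobile_app_c": (31_000, 400),
}

loc = [pair[0] for pair in projects.values()]
smells = [pair[1] for pair in projects.values()]

r = pearson_r(loc, smells)
print(f"Pearson r between LoC and smell count: {r:.3f}")
```

A value of r close to 1 would support the proportional growth described in the text, while projects that deviate from the trend (the "few cases" mentioned above) lower it. On Python 3.10 or later, `statistics.correlation` from the standard library computes the same coefficient.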


Fig. 1: Comparison of the distribution of design smells in
desktop and mobile applications.

   We also went further to study the distribution of design
smells using the POPC clustering algorithm discussed in Section
III. The motivation was to understand design smell distribution
from an unsupervised learning perspective. This way, we can
observe pairs or groups of design smells that often occur
together and/or have similar characteristics in desktop versus
mobile applications from the dataset.
   The clustering results in Figure 4 reveal groups of design
smells that often co-occur in desktop or Android applications.
Some of the clusters are expected, while others are not obvious
and call for more study to understand the reason for their
appearance and relationships. For example, SpeculativeGenerality
and SwissArmyKnife both occur in the same cluster
for desktop and mobile applications. This is expected because
these design smells are theoretically related. Other similar
relationships in the clusters produced by desktop and mobile
data include: ComplexClass and LongMethod,
ManyFieldAttributesButNotComplex, MessageChains, SpaghettiCode.
   3) Statistical Significance: To determine whether the
variations in diversity, distribution and magnitude of design smells
across desktop and mobile applications are statistically
significant, we conducted a statistical test using Welch's
two-sample t-test. Welch's t-test is a less restrictive version
of Student's t-test and is mostly recommended for dealing
with data of unequal variance and sample size, while
maintaining the normality assumption. We chose this particular test
because the number of projects we selected for desktop and
  TABLE IV: An example of data constructed from the output of POPC clustering to create the dendrograms in Figure 4.
                  No.     Design Smell                       C0    C1     C2    C3   C4    C5    C6   C7       C8   C9
                  1       LongMethod                         0     0      1     0    0     0     0    0        0    0
                  2       ComplexClass                       0     0      0     0    1     0     0    0        0    0
                  3       LongParameterList                  1     1      1     1    0     0     1    0        0    0
                  4       LazyClass                          0     0      0     1    0     0     0    0        0    0
                  5       Blob                               1     1      1     0    0     1     1    1        0    0
                  6       ClassDataShouldBePrivate           1     1      1     0    1     0     1    1        0    1
                  7       RefusedParentBequest               1     1      1     1    0     1     1    0        0    0
                  8       AntiSingleton                      0     0      0     0    0     1     0    0        0    0
                  9       BaseClassShouldBeAbstract          1     1      1     0    0     1     0    0        0    0
                  10      SpeculativeGenerality              0     0      0     0    0     0     0    0        1    0
                  11      SpaghettiCode                      0     0      1     0    1     0     1    0        1    0
                  12      ManyFieldAttributesButNotComplex   1     0      0     1    0     1     1    0        0    0
                  13      MessageChains                      1     1      1     1    1     1     1    1        1    0
                  14      SwissArmyKnife                     0     0      0     0    0     0     0    0        0    0
                  15      LargeClass                         1     1      1     0    0     0     1    0        0    0




                             (a) Desktop                                                          (b) Mobile

                         Fig. 4: Dendrogram showing groups of design smells that co-occur frequently.
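
A binary smell-by-cluster matrix such as Table IV can be turned into a dendrogram by agglomerative (hierarchical) clustering of its rows. The sketch below is illustrative only: the paper does not state which distance or linkage was used to draw Figure 4, so the Jaccard distance and average linkage chosen here are assumptions, and the helper names are hypothetical. The rows are copied from Table IV.

```python
from itertools import combinations

# Rows copied from Table IV: membership of each design smell in clusters C0..C9.
TABLE_IV = {
    "LongMethod":                       [0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
    "ComplexClass":                     [0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
    "LongParameterList":                [1, 1, 1, 1, 0, 0, 1, 0, 0, 0],
    "LazyClass":                        [0, 0, 0, 1, 0, 0, 0, 0, 0, 0],
    "Blob":                             [1, 1, 1, 0, 0, 1, 1, 1, 0, 0],
    "ClassDataShouldBePrivate":         [1, 1, 1, 0, 1, 0, 1, 1, 0, 1],
    "RefusedParentBequest":             [1, 1, 1, 1, 0, 1, 1, 0, 0, 0],
    "AntiSingleton":                    [0, 0, 0, 0, 0, 1, 0, 0, 0, 0],
    "BaseClassShouldBeAbstract":        [1, 1, 1, 0, 0, 1, 0, 0, 0, 0],
    "SpeculativeGenerality":            [0, 0, 0, 0, 0, 0, 0, 0, 1, 0],
    "SpaghettiCode":                    [0, 0, 1, 0, 1, 0, 1, 0, 1, 0],
    "ManyFieldAttributesButNotComplex": [1, 0, 0, 1, 0, 1, 1, 0, 0, 0],
    "MessageChains":                    [1, 1, 1, 1, 1, 1, 1, 1, 1, 0],
    "SwissArmyKnife":                   [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    "LargeClass":                       [1, 1, 1, 0, 0, 0, 1, 0, 0, 0],
}


def jaccard_distance(a, b):
    """1 - |A and B| / |A or B| for binary vectors; 0.0 if both are all zero."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return 0.0 if either == 0 else 1.0 - both / either


def average_linkage(rows):
    """Agglomerative clustering with average linkage (an assumed choice).

    Returns the merge history as (members_a, members_b, distance) tuples,
    which is the information a dendrogram visualizes.
    """
    def cluster_distance(c1, c2):
        # Average linkage: mean pairwise distance between members.
        total = sum(jaccard_distance(rows[p], rows[q]) for p in c1 for q in c2)
        return total / (len(c1) * len(c2))

    clusters = [(name,) for name in rows]
    merges = []
    while len(clusters) > 1:
        # Merge the closest pair of clusters at each step.
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: cluster_distance(*pair))
        merges.append((a, b, cluster_distance(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


merges = average_linkage(TABLE_IV)
for members_a, members_b, distance in merges:
    print(f"{distance:.3f}  {' + '.join(members_a)}  |  {' + '.join(members_b)}")
```

Smells that are merged early (at a small distance) share most of their cluster memberships and would sit on neighbouring leaves of the dendrogram. If SciPy is available, `scipy.cluster.hierarchy.linkage` and `scipy.cluster.hierarchy.dendrogram` can compute and plot the same kind of hierarchy directly.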


mobile applications is not the same. We carried out this test
using the data in Figure 1 and obtained the following results:
Welch's test statistic: -1.201, p-value: 0.242. The result indicates
that, despite the various observable variations in diversity,
distribution and magnitude of design smells across desktop
and mobile applications, these variations are not statistically
significant. Therefore, we cannot conclude that design smells
occur more frequently in desktop or in mobile applications.
This result is also consistent with previous papers focusing on
code smells, for example, the paper by Mannan et al. [19].

                        V. IMPLICATIONS

   In this section, we present the implications of this study for
researchers, software engineers and the development of
design smell detection tools.

A. To Researchers

   Despite the variations in diversity, distribution and magnitude
of design smells between desktop and mobile, our result
indicates that almost all instances of design smells are well
represented in both software domains, as shown in Figure 1.
Therefore, researchers studying design smells in one domain
have a high possibility of obtaining a representative set of
data which can easily be generalized for both domains. This
is also backed by our statistical significance test result, which
indicates that there is no statistically significant difference in the
number of design smells across desktop and mobile applications.
   The clustering results in Figure 4 provide some practical
confirmation of theories related to shared characteristics and
similarities among design smells. For example, we were able
to show, using unsupervised learning, that SpeculativeGenerality
and SwissArmyKnife are closely related. However, we
also found some unexpected relationships/similarities in the
clusters which require more research to understand them and
make recommendations. We, therefore, encourage researchers
to consider exploring this direction in future studies.
   Our study also provides a good ground for software
educators to demonstrate various design principles to students.
As such, learners can practically observe examples of
well-designed and poorly designed systems for a wide range of
software systems.

B. To Software Developers

   We discuss three significant implications of this study to
software developers as follows:
TABLE V: Notable differences between Java-based desktop
and Android applications.
 No.   Desktop Application                    Android Application
 1     The application entry point depends    There is no main method when
       on the existence of a special          developing mobile applications.
       method, i.e. the main method.          The entry points are given by
                                              event handlers such as onCreate,
                                              onPause, onResume, etc.
 2     The application's underlying GUI is    There is another layer of
       designed using the Java Swing          abstraction, i.e. complete
       library (core Java language).          separation of the application
                                              logic from its presentation.
                                              Moreover, the GUI is constructed
                                              using eXtensible Markup Language
                                              (XML).
 3     Consists of all J2SE libraries,        Android does not have all J2SE
       Swing, JavaFX, etc.                    APIs, Swing or JavaFX.
 4     Depends solely on the OOP paradigm.    Although based on OOP, Android
                                              mobile apps employ a reactive,
                                              event-driven programming paradigm.
 5     The application directly benefits      Designed with resource limits in
       from the host infrastructure, which    mind. Some of the app's
       can be easily scaled vertically or     capabilities are constrained by
       horizontally.                          the underlying hardware
                                              infrastructure such as memory,
                                              storage, processing power and
                                              peripherals.
 6     Source code is compiled to Java        Source code is compiled to Java
       bytecode.                              bytecode, then to DEX bytecode
                                              (two stages).

   1) Software Design and Development: As shown in Figure
1, developers can know from the start of any new software
project that they should pay attention to specific implementation
details of their application to mitigate common design
smells. For example, they can observe that LongMethod,
ComplexClass and LongParameterList are most likely to occur
in an application. We also show groups of design smells that
frequently co-occur. This knowledge can help developers
to correctly plan their implementation and/or provide guidelines
for contributors to mitigate these design smells, thereby
limiting future software failure due to sloppy or unintended
programming/implementation choices.
   2) Quality Assurance: Software Quality Assurance (SQA)
is an essential aspect of software engineering that involves
processes and methods to ensure proper software quality such
as conformance to standards or models. This study provides
evidence to developers and quality assurance personnel of
the importance of design smell analysis in assessing the
quality of their systems by showing the diversity, distribution
and magnitude of design smells, which can negatively impact
software maintenance effort.
   3) Guided Code Review and Refactoring: Code review and
refactoring are common exercises carried out by developers to
(i) ensure code quality and (ii) improve the internal structure
of existing software code without affecting its observable
behavior [10]. However, reviewing and refactoring
source code becomes expensive in terms of time and resources,
especially for evolving software systems. Therefore, there is
a need for the simplification of those processes. As such, we
believe that our study results can guide developers by quickly
pointing them to specific features of source code that often
result in poor software quality, such as long methods, complex
classes or long parameter lists.

C. On the Development of Design Smell Detection Tools

   Design smell detection tools are significant not only for
research purposes but also for ensuring high-quality software
design. The good news is that, judging from Figure 1, detection
tools developed for desktop applications are likely to
work for Android applications as well. However, we believe
that further improvements such as metrics optimization and
enhancing code linting can significantly boost design smell
detection tools, as discussed below.
   1) Metrics Optimization: Design smells are detected using
a combination of software metrics. However, metrics-based
smell detection methods have some known limitations, such as
their inability to detect many smells using only commonly known
metrics. Besides, metrics-based strategies heavily depend on
the choice of the best threshold value by the researcher, which
is normally a significant challenge since this choice is almost
always empirical and based on trial and error [11]. This research
provides an opportunity for design smell detection tool developers
to review those metrics and tailor them for the detection of
a specific design smell or combination of design smells based
on the way they occur across desktop and Android applications.
Moreover, developers can use the knowledge in Figure 4 to
optimize design smell detection tools to become more efficient
through the use of just a few metrics to detect a combination
of design smells.
   2) Improve Code Linting: A large percentage of software
engineers (both junior and senior) embrace the use of code
linters in their daily development activity. A linter analyzes
source code to detect flaws, check style conventions, and flag
potential bugs and other code constructs [24]. However, most linters
cannot flag design smells. We believe that the results of this
study can motivate design smell detection tool builders to
integrate design smell detection capability in linters as an
extension or plugin.

                   VI. THREATS TO VALIDITY

A. Construct Validity

   Our goal was to compare the occurrence of design smells in
desktop and mobile applications. We believe that we were able
to achieve this goal by comparing the two groups with regard to
diversity, distribution, magnitude and co-occurrence of design
smells.

B. Internal Validity

   In this study, we relied solely on the Ptidej tool suite for
the detection of design smells. Therefore, the accuracy of our
results also depends on the accuracy of this tool. However, the
efficacy of Ptidej has been evaluated in a previous study [25].
Besides, the Ptidej tool suite is freely available and able to detect
a large number of design smells.
   We carried out our analysis on just a single version of each
selected project. It is possible that the results can vary if
historical data is considered. However, since our focus was
not on the analysis of software change history, we consider
this threat acceptable and an opportunity for future work.

C. External Validity

   Regarding the generalizability of our result, first, we are
aware that we carried out this study on Java-based applications
only. Other platforms that use OOP languages exist, for
example, Windows Mobile based on C#, Apple iOS based on
Objective-C/Swift, etc. However, we believe that the methods
used in this study can be generalized to other OOP systems in
various programming languages because the principle of OOP
is consistent regardless of the implementing programming
language.
   The dataset used in this study was generated from only
12 GitHub projects. Although we believe that the size of our
dataset is considerable, using a larger dataset would give
more confidence in the results presented in this paper.

            VII. CONCLUSION AND FUTURE WORK

   In this paper, we conducted an exploratory study to compare
design smells in desktop and mobile applications using a sizable
dataset of twelve (12) Java-based open-source projects. We
reported empirical evidence on the variations in diversity,
distribution, magnitude and co-occurrence of design smells
using statistical methods and unsupervised learning. The result
of the study indicated that desktop and mobile applications are
quite similar in terms of design smell occurrence. We also found
pairs/groups of design smells that often co-occur. Some of the
pairs/groups are expected (e.g. SpeculativeGenerality,
SwissArmyKnife), while others (e.g. LongParameterList,
ClassDataShouldBePrivate) require further study to understand any
innate relationships.
   We plan to extend the study to include class role-stereotypes
[26]. It is quite intuitive that both design smells and
role-stereotypes play major roles in the design and maintenance
of a software system. It would be interesting to see how
design smells vary across role-stereotypes in desktop and
Android applications. We are also interested in understanding
the variation of design smells in cloud-native versus traditional
applications. We find this important because much
software development activity has now shifted to the cloud.

                            REFERENCES
 [1]   M. Fowler, Refactoring: Improving the Design of Existing Code.
       Boston, MA, USA: Addison-Wesley Longman Publishing Co., Inc.,
       1999, ISBN: 0-201-48567-2.
 [2]   A. Kaur, “A systematic literature review on empirical analysis of
       the relationship between code smells and software quality attributes,”
       Archives of Computational Methods in Engineering, pp. 1–30, 2019.
 [3]   A. Imran, “Design smell detection and analysis for open source
       Java software,” in 2019 IEEE International Conference on Software
       Maintenance and Evolution (ICSME), IEEE, pp. 644–648.
 [4]   D. Di Nucci, F. Palomba, D. A. Tamburri, A. Serebrenik, and A.
       De Lucia, “Detecting code smells using machine learning techniques:
       Are we there yet?” in 2018 IEEE 25th International Conference on
       Software Analysis, Evolution and Reengineering (SANER), IEEE, 2018,
       pp. 612–621.
 [5]   G. Saranya, H. K. Nehemiah, A. Kannan, and V. Nithya, “Model
       level code smell detection using egapso based on similarity measures,”
       Alexandria Engineering Journal, vol. 57, no. 3, pp. 1631–1642, 2018.
 [6]   A. Barbez, F. Khomh, and Y.-G. Guéhéneuc, “A machine-learning
       based ensemble method for anti-patterns detection,” Journal of Systems
       and Software, vol. 161, p. 110486, 2020.
 [7]   S. Kaur and S. Singh, “Influence of anti-patterns on software
       maintenance: A review,” International Journal of Computer Applications,
       vol. 975, p. 8887, 2015.
 [8]   T. Guggulothu and S. A. Moiz, “An approach to suggest code smell
       order for refactoring,” in International Conference on Emerging
       Technologies in Computer Engineering, Springer, 2019, pp. 250–260.
 [9]   Z. Soh, A. Yamashita, F. Khomh, and Y.-G. Guéhéneuc, “Do code
       smells impact the effort of different maintenance programming
       activities?” in 2016 IEEE 23rd International Conference on Software
       Analysis, Evolution, and Reengineering (SANER), IEEE, vol. 1, 2016,
       pp. 393–402.
[10]   B. Turkistani and Y. Liu, “Reducing the large class code smell by
       applying design patterns,” in 2019 IEEE International Conference on
       Electro Information Technology (EIT), IEEE, 2019, pp. 590–595.
[11]   T. Sharma and D. Spinellis, “A survey on software smells,” Journal
       of Systems and Software, vol. 138, pp. 158–173, 2018.
[12]   B. Bafandeh Mayvan, A. Rasoolzadegan, and A. Javan Jafari, “Bad
       smell detection using quality metrics and refactoring opportunities,”
       Journal of Software: Evolution and Process, e2255, 2020.
[13]   A. Kaur and S. Singh, “Detecting software bad smells from software
       design patterns using machine learning algorithms,” International
       Journal of Applied Engineering Research, vol. 13, no. 11,
       pp. 10005–10010, 2018.
[14]   H. Liu, Z. Xu, and Y. Zou, “Deep learning based feature envy
       detection,” in Proceedings of the 33rd ACM/IEEE International Conference
       on Automated Software Engineering, 2018, pp. 385–396.
[15]   Y. Yin, Q. Su, and L. Liu, “Software smell detection based on machine
       learning and its empirical study,” in Second Target Recognition and
       Artificial Intelligence Summit Forum, International Society for Optics
       and Photonics, vol. 11427, 2020, 114270P.
[16]   N. Moha, Y.-G. Guéhéneuc, L. Duchien, and A.-F. Le Meur, “Decor: A
       method for the specification and detection of code and design smells,”
       IEEE Transactions on Software Engineering, vol. 36, no. 1, pp. 20–36,
       2009.
[17]   F. Palomba, G. Bavota, M. Di Penta, F. Fasano, R. Oliveto, and A.
       De Lucia, “On the diffuseness and the impact on maintainability of
       code smells: A large scale empirical investigation,” Empirical Software
       Engineering, vol. 23, no. 3, pp. 1188–1221, 2018.
[18]   A. S. Cairo, G. d. F. Carneiro, and M. P. Monteiro, “The impact of code
       smells on software bugs: A systematic literature review,” Information,
       vol. 9, no. 11, p. 273, 2018.
[19]   U. A. Mannan, I. Ahmed, R. A. M. Almurshed, D. Dig, and C.
       Jensen, “Understanding code smells in Android applications,” in 2016
       IEEE/ACM International Conference on Mobile Software Engineering
       and Systems (MOBILESoft), IEEE, 2016, pp. 225–236.
[20]   S. Habchi, G. Hecht, R. Rouvoy, and N. Moha, “Code smells in iOS
       apps: How do they compare to Android?” in 2017 IEEE/ACM 4th
       International Conference on Mobile Software Engineering and Systems
       (MOBILESoft), IEEE, 2017, pp. 110–121.
[21]   F. Palomba, R. Oliveto, and A. De Lucia, “Investigating code smell
       co-occurrences using association rule learning: A replicated study,” in
       2017 IEEE Workshop on Machine Learning Techniques for Software
       Quality Evaluation (MaLTeSQuE), IEEE, 2017, pp. 8–13.
[22]   Y.-G. Guéhéneuc, “Ptidej: A flexible reverse engineering tool suite,” in
       2007 IEEE International Conference on Software Maintenance, IEEE,
       2007, pp. 529–530.
[23]   P. Taraba, “Clustering for binary featured datasets,” in The World
       Congress on Engineering and Computer Science, Springer, 2017,
       pp. 127–142.
[24]   S. Habchi, X. Blanc, and R. Rouvoy, “On adopting linters to deal
       with performance concerns in Android apps,” in 2018 33rd IEEE/ACM
       International Conference on Automated Software Engineering (ASE),
       IEEE, 2018, pp. 6–16.
[25]   G. Rasool, P. Maeder, and I. Philippow, “Evaluation of design pattern
       recovery tools,” Procedia Computer Science, vol. 3, pp. 813–819, 2011.
[26]   R. J. Wirfs-Brock, “Characterizing classes,” IEEE Software, vol. 23,
       no. 2, pp. 9–11, 2006.