<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Proceedings of the SQAMIA</journal-title>
      </journal-title-group>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Comparison of Software Structures in Java and Erlang Programming Languages</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>ANA VRANKOVIĆ</string-name>
          <email>avrankovic@riteh.hr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>TIHANA GALINAC GRBAC</string-name>
          <email>tgalinac@riteh.hr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>MELINDA TÓTH</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Rijeka, Faculty of Engineering</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>ELTE Eötvös Loránd University</institution>
          ,
          <addr-line>Budapest</addr-line>
          ,
          <country country="HU">Hungary</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2017</year>
      </pub-date>
      <volume>6</volume>
      <fpage>11</fpage>
      <lpage>13</lpage>
      <abstract>
        <p>Empirical studies on fault behaviour in evolving complex software systems have shown that the communication structures among software entities such as classes, modules and software units significantly affect system fault behaviour. We were therefore motivated to investigate software structures further. One interesting question is how software structures differ among software products written in different programming languages. In this work we present a preliminary study for which we developed tools to examine the software structures of products written in the Java and Erlang programming languages. We describe how we extract the software structure from a software product and provide preliminary results from the analysis of four Erlang and four Java software products.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. INTRODUCTION</title>
      <p>
        [...] in terms of network graphs. Following [<xref ref-type="bibr" rid="ref10">Milo et al. 2002</xref>] and our previous work [Petric and Grbac 2014;
Petric et al. 2014a], we used thirteen subgraph types to study software structure, as presented in
Figure 1. These subgraphs cover all possible connection patterns among three nodes. Every subgraph type is a directed graph
in which each node is a class/module and every edge is a connection between two of them. This preliminary
study is based on a simple comparison of the subgraph counts found in software structures obtained from
different Erlang and Java software products.
      </p>
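      <p>The thirteen subgraph types can also be enumerated programmatically. The following is a minimal sketch, not the tooling used in this study. It assumes that the subgraph ids follow a numbering convention often used by network-motif tools: the 3x3 adjacency matrix of a subgraph is read as a binary number (an edge from node i to node j sets bit 3i+j), and the smallest value over all relabelings of the three nodes is taken as the id. The function names are hypothetical. Under that assumption, the enumeration yields the same thirteen ids that are referred to in this paper.</p>
      <preformat>
```python
from itertools import permutations, product

NODES = (0, 1, 2)
# The six possible directed edges (source, target) among three nodes, no self-loops.
PAIRS = [(i, j) for i in NODES for j in NODES if i != j]


def canonical_id(edges):
    """Return the smallest adjacency-matrix code over all node relabelings.

    Assumed convention: an edge i -> j sets bit 3*i + j of the code.
    """
    return min(
        sum(2 ** (3 * perm[i] + perm[j]) for i, j in edges)
        for perm in permutations(NODES)
    )


def is_weakly_connected(edges):
    """Three nodes are connected when at least two of the three node pairs are linked."""
    return len({frozenset(edge) for edge in edges}) >= 2


def three_node_types():
    """Enumerate the ids of all connected directed three-node subgraphs."""
    ids = set()
    for mask in product((0, 1), repeat=len(PAIRS)):
        edges = [pair for pair, used in zip(PAIRS, mask) if used]
        if is_weakly_connected(edges):
            ids.add(canonical_id(edges))
    return sorted(ids)


print(three_node_types())
# [6, 12, 14, 36, 38, 46, 74, 78, 98, 102, 108, 110, 238]
```
      </preformat>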
      <p>The rest of the paper is organized as follows. Section 2 describes the background.
Section 3 presents the tools used to extract software structures from code written in the two programming
languages. Section 4 presents the preliminary results obtained by a simple comparison of
the structures found in Erlang and Java code. Section 5 discusses threats to validity, and Section 6 concludes the paper.</p>
    </sec>
    <sec id="sec-2">
      <title>2. BACKGROUND</title>
      <p>
        To analyze software structure we can consider different types of structures, such as modules, classes and other
types of software units. Graph theory is the field of study concerned with the formal description and
analysis of graphs [<xref ref-type="bibr" rid="ref5">Bullmore and Sporns 2009</xref>]. Graph theory also covers complex networks,
that is, graphs based on real-world networks; they are discussed in [<xref ref-type="bibr" rid="ref14">Simon 1991</xref>]. The analysis of systems
using network graphs and complex networks has long been applied in many scientific fields: in
medicine, for protein analysis [Góes-Neto et al. 2010]; in logistics [Pais Montes et al. 2013];
in crime analysis [Colladon and Remondi 2017]; in electrical system analysis [Alves da Silva et al. 2012];
and many more. In computer science there have been a few proposals
to use complex networks as a tool for better understanding software behavior. In [<xref ref-type="bibr" rid="ref15">Jenkins and
Kirk 2007</xref>] software architecture graphs of applications written in Java were presented as complex networks.
An interesting finding was that, as the software ages, more outgoing calls than incoming
calls are present. In [Chong and Lee 2015] complex networks are used as a tool for analyzing
the complexity of software systems built with an object-oriented approach. The authors used a weighted complex
network of a system to help them understand its maintainability and reliability. They also managed
to identify violations of common software design principles. In [<xref ref-type="bibr" rid="ref9">Moyano et al. 2011</xref>] the
community structure of a real complex software network is explored. The results of that paper show a
significant dependence between community structure and internal dynamical processes. Relationships
between Erlang processes have been discussed in [Bozó and Tóth 2016]. We found no work
comparing Java and Erlang software structures. Since Java and Erlang differ
in paradigm and are not usually used in similar products, the comparison between them is not
often explored. We therefore wanted to compare them, precisely because of their differences, to see whether there is any
variation in the way their classes/modules communicate in terms of subgraph type.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. TOOLS</title>
      <p>In this work we used four different tools.</p>
      <p>To analyze applications written in Java we used the tool rFind [Petric and Grbac 2014; Petric et al.
2014a]. The input of rFind is the source code of an application written in Java.</p>
      <p>As output we receive two files that together show all calls between classes. The first is the
.classlist file, which lists all classes and assigns each a class id for easier reading. Each class represents
a node in the network. The second is the .graph file, which lists all connections between class ids.
Every connection is treated as an edge between nodes.</p>
      <p>To obtain the same information from Erlang applications we used the tool RefactorErl [Bozó et al.
2011]. RefactorErl is an open-source static source code analyser and transformer tool for Erlang.
It supports dependency examination at both the module and the function level and can present the result
as a graph to the user. The input of the tool was applications written in Erlang, and it
produced a textual representation of their dependencies as output. Although the presentation of the result
was quite different from the output of rFind, the main idea was the same: to present the communication
between modules.</p>
      <p>The SuBuCo tool [Petric et al. 2014b] is an application that expects the rFind output as input: the
.classlist and .graph files. It searches for three-node subgraph structures inside the .graph file. Its output
is a file listing all subgraphs that appear in the given code,
grouped by subgraph type, together with the ids of every class/module contained in each subgraph.</p>
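      <p>The kind of search described above can be illustrated with a brute-force sketch. This is not SuBuCo's implementation. It assumes that the graph is available as a list of (source id, target id) pairs, that subgraphs are counted as induced subgraphs over every triple of nodes, and that ids are derived from the adjacency matrix as the smallest binary code over all node relabelings. All function names are hypothetical, and the loop is cubic in the number of nodes, so it is suitable only for small graphs.</p>
      <preformat>
```python
from collections import Counter
from itertools import combinations, permutations


def canonical_id(edges):
    """Smallest adjacency-matrix code over all relabelings (edge i -> j sets bit 3*i + j)."""
    return min(
        sum(2 ** (3 * perm[i] + perm[j]) for i, j in edges)
        for perm in permutations(range(3))
    )


def count_subgraphs(edge_list):
    """Count connected three-node induced subgraphs per type id."""
    edges = {(s, t) for s, t in edge_list if s != t}  # drop self-calls
    nodes = sorted({node for edge in edges for node in edge})
    counts = Counter()
    for triple in combinations(nodes, 3):
        # Edges among the three chosen nodes, re-indexed to 0..2.
        local = [
            (i, j)
            for i in range(3)
            for j in range(3)
            if i != j and (triple[i], triple[j]) in edges
        ]
        # Connected when at least two of the three node pairs are linked.
        if len({frozenset(edge) for edge in local}) >= 2:
            counts[canonical_id(local)] += 1
    return counts


# Hypothetical graph: classes 1 and 2 both call class 3, and class 3 calls class 4.
example = [(1, 3), (2, 3), (3, 4)]
print(dict(count_subgraphs(example)))
# {36: 1, 12: 2}
```
      </preformat>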
      <p>Since the output of RefactorErl was not in the form that SuBuCo expects, we wrote a parser that converts
the result into the same .classlist and .graph files. The parser was implemented in Java; its
input files are the files produced by RefactorErl and its output files are the two files SuBuCo needs. The
whole analysis process can be seen in Figure 2.</p>
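      <p>The conversion step can be sketched as follows. The sketch is written in Python for brevity, whereas the parser used in the study was written in Java, and both file formats are assumptions made purely for illustration: the input is assumed to hold one dependency per line in the form "caller -> callee", the .classlist content one "id name" pair per line, and the .graph content one "source_id target_id" pair per line. The actual RefactorErl output and the rFind file formats may differ, and the function and module names are hypothetical.</p>
      <preformat>
```python
def convert_dependencies(lines):
    """Turn 'caller -> callee' lines into (classlist_text, graph_text).

    Both the input and the output layouts are assumed, not taken from the tools.
    """
    ids = {}      # module name -> numeric id, in order of first appearance
    edges = []    # unique (source_id, target_id) pairs, in order of appearance
    seen = set()
    for line in lines:
        line = line.strip()
        if "->" not in line:
            continue  # skip blank or unrelated lines
        caller, callee = (part.strip() for part in line.split("->", 1))
        for name in (caller, callee):
            ids.setdefault(name, len(ids))
        edge = (ids[caller], ids[callee])
        if edge[0] != edge[1] and edge not in seen:
            seen.add(edge)
            edges.append(edge)
    classlist = "\n".join(f"{number} {name}" for name, number in ids.items())
    graph = "\n".join(f"{source} {target}" for source, target in edges)
    return classlist, graph


# Hypothetical dependency listing with one duplicate line.
sample = ["mod_a -> mod_b", "mod_c -> mod_b", "mod_a -> mod_c", "mod_a -> mod_b"]
classlist_text, graph_text = convert_dependencies(sample)
print(classlist_text)
print(graph_text)
```
      </preformat>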
    </sec>
    <sec id="sec-4">
      <title>4. RESULTS</title>
      <p>Our tests were conducted on four software products implemented in the Erlang programming language
and four written in Java. The analysed Erlang products are: Mnesia, a distributed
telecommunications database; Dialyzer, a static analysis tool for identifying software discrepancies; Cowboy,
an HTTP server for Erlang/OTP; and RabbitMQ server, which runs a multi-protocol messaging
broker. The former two are part of the standard Erlang/OTP distribution; the latter two
were taken from open git repositories. For the analysis of software written in Java we used the Java
Development Tools (JDT) and Plug-in Development Environment (PDE) projects from the Eclipse project; Open
Microscopy Environment, open-source software and data format standards for the storage
and manipulation of biological microscopy data; and Ultimate Android, a development framework taken from
a git repository.</p>
      <p>Erlang projects: Mnesia, Cowboy, Dialyzer, RabbitMQ.
Java projects: OpenMicroscopy, Ultimate Android, JDT, PDE.</p>
      <p>The number of edges and nodes for each tested software product can be seen in Table I. The number of edges
is much larger in the Java software, even where the number of nodes is smaller than in the Erlang
software. Where the numbers of nodes are similar, as for the Mnesia and Ultimate Android applications,
the number of edges is still much greater in the Java application than in the Erlang one. Communication is thus far more
common in software written in Java. In all tested applications the number of edges grows
with the number of nodes.</p>
      <p>Subgraph ids discussed in this section refer to the network subgraphs in Figure 1. In three
out of four Erlang applications the subgraph with id 36 was the most common; only Dialyzer had different results. The same subgraph
id was also the most frequent in both Eclipse Java projects and in the Java projects gathered from git repositories. It
seems that communication in which multiple classes/modules heavily use one library is the most
frequent kind. We can see from Tables II-V that in all
Erlang applications subgraph ids 36, 6 and 12 are the most common ones, most often in that exact
order. The subgraph with id 6 represents communication in which one node needs multiple resources from
other nodes, and the subgraph with id 12 could be the situation in which communication flows from one
node to a second node and, when the second node is triggered, it calls the third node. In Tables VI-IX
we can see that the Java projects behave similarly. The order of frequency is the same in all projects except
PDE, where id 38 occurs more often than id 12.</p>
      <p>In terms of subgraph id appearance, the subgraphs with ids 238 and 110 do not appear
in any Erlang application or in any Java application. Subgraphs with ids 102 and 98 were
found in Erlang applications but not in any of the Java applications. Applications written
in Erlang and Java thus behave similarly in terms of subgraph id appearance, even though the Java
software products are larger in class/module size.</p>
      <p>The Pareto diagrams in Figures 3 and 4 show significant growth only for
subgraph types 36, 6, 12 and 38, in both the Java and the Erlang projects.</p>
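      <p>The shares and cumulative shares that underlie a Pareto diagram can be computed as in the following sketch. The counts in the example are invented solely to illustrate the calculation and are not measurements from this study; the function name is hypothetical.</p>
      <preformat>
```python
def pareto_rows(counts):
    """Return (subgraph_id, share_percent, cumulative_percent) rows, largest count first."""
    total = sum(counts.values())
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    rows = []
    running = 0
    for subgraph_id, count in ranked:
        running += count
        rows.append((subgraph_id, 100.0 * count / total, 100.0 * running / total))
    return rows


# Illustrative counts only (not taken from the tables of this paper).
illustrative_counts = {6: 250, 36: 600, 38: 50, 12: 100}
for subgraph_id, share, cumulative in pareto_rows(illustrative_counts):
    print(f"id {subgraph_id}: {share:.1f}% (cumulative {cumulative:.1f}%)")
```
      </preformat>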
    </sec>
    <sec id="sec-5">
      <title>5. THREATS TO VALIDITY</title>
      <p>Data collection and analysis are possible on any code written in Erlang or Java. The Erlang applications that
were tested are server implementations, a database and a static analysis tool. The Java applications
were frameworks for developing Java software and software for working with specific types of data.
The function of the software is therefore not the same in the Erlang and Java applications. Since Erlang is a language used
for scalable soft real-time systems and Java is a general-purpose programming language, comparing
applications written in each of them cannot give us generalized conclusions. Comparing
similar types of languages could be a better approach.</p>
    </sec>
    <sec id="sec-6">
      <title>6. CONCLUSION</title>
      <p>In this study our main focus was to analyze the code structure of software written in Erlang and to compare
it with that of software written in Java.</p>
      <p>To do that we combined several tools to obtain output that we
can analyze. We represented class/module communication using thirteen subgraph types.</p>
      <p>In code written in Java there was a much larger number of communicating classes than there were
communicating functions in Erlang code. Subgraph types 38, 36, 6, 46, 12, 74 and 14 were present in all
tested code. Types 108 and 78 were present in two software applications. Besides id 46, they are the
only types present that have more than three edges. The type with the highest occurrence
in every tested application was subgraph type 36, followed by types 6 and 12. Subgraph ids 102, 238, 98
and 110 did not appear at all. No communication with more than four interactions among
three classes was found.</p>
      <p>Unlike in the Java applications, in the Erlang applications subgraph id 98 occurred in all tested applications
and id 102 appeared in two of them. The subgraph with id 98 is the only one in which
communication is circular: it starts and ends in the same node, with just one interaction between each pair of nodes.
Just as in the Java applications, ids 36, 6 and 12 had the highest occurrence in the Erlang applications, but in
different proportions. While subgraph id 36 accounted for over 89% of all subgraphs in the Java applications, in
Erlang the same id accounted for between 58% and 68%, and in Dialyzer for only 34%.
In the tested Java software, id 6 had a low appearance rate of under 10%. In the Erlang applications the result was
different: id 6 had a presence of around 20% in Mnesia and RabbitMQ, and in Dialyzer it had the largest
share of appearances, 51%.</p>
      <p>We can see that in code written in Java, subgraph id 36 accounts for more than 90% of all
communication, while in Erlang code ids 36 and 6 together account for 80-90%.</p>
      <p>Based on the code analysis, we can conclude that although the languages behave similarly,
there are some differences. Some structures appear in Erlang but not in Java,
specifically structures with more communication edges between modules, and id 98, in which
communication is circular. It is possible that those types are specific to Erlang. There is also a difference
in the share of each subgraph type. While id 36, the type of
communication in which one library is heavily used by other classes, makes up the great majority of subgraphs in Java, in Erlang its share is
much smaller. The usage of libraries thus appears to be greater in Java programs. There is also a big
difference in the number of communicating classes/modules. It seems that classes in Java
tend to communicate more often than Erlang modules. These results may be explained
by the fact that Java is an object-oriented language and is based on object communication.</p>
      <p>In our future work we aim to analyze code written in other programming and scripting
languages, both functional and object-oriented. This should let us draw firmer conclusions
about which subgraph types are specific to individual programming languages or applications.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <given-names>Alexandre P.</given-names>
            <surname>Alves da Silva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Antonio C.S.</given-names>
            <surname>Lima</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Suzana M.</given-names>
            <surname>Souza</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <article-title>Fault location on transmission lines using complex-domain neural networks</article-title>
          .
          <source>Electrical Power and Energy Systems</source>
          <volume>43</volume>
          (Dec.
          <year>2012</year>
          ),
          <fpage>720</fpage>
          -
          <lpage>727</lpage>
          . https://doi.org/10.1016/j.ijepes.2012.05.046
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <given-names>Aristóteles</given-names>
            <surname>Góes-Neto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Marcelo V.C.</given-names>
            <surname>Diniz</surname>
          </string-name>
          , et al.
          <year>2010</year>
          .
          <article-title>Comparative protein analysis of the chitin metabolic pathway in extant organisms: A complex network approach</article-title>
          .
          <source>BioSystems</source>
          <volume>101</volume>
          ,
          <issue>1</issue>
          (July
          <year>2010</year>
          ),
          <fpage>59</fpage>
          -
          <lpage>66</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <given-names>I.</given-names>
            <surname>Bozó</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Horpácsi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Kitlei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Horváth</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Kőszegi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Tejfel</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Tóth</surname>
          </string-name>
          .
          <year>2011</year>
          .
          <article-title>RefactorErl - source code analysis and refactoring in Erlang</article-title>
          .
          <source>In Proceedings of the 12th Symposium on Programming Languages and Software Tools</source>
          . Tallinn, Estonia.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <given-names>István</given-names>
            <surname>Bozó</surname>
          </string-name>
          and
          <string-name>
            <given-names>Melinda</given-names>
            <surname>Tóth</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Analysing and Visualising Erlang Behaviours</article-title>
          .
          <source>AIP Conference Proceedings</source>
          <volume>1738</volume>
          (
          <year>June 2016</year>
          ). http://dx.doi.org/10.1063/1.4952023
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <given-names>E.</given-names>
            <surname>Bullmore</surname>
          </string-name>
          and
          <string-name>
            <given-names>O.</given-names>
            <surname>Sporns</surname>
          </string-name>
          .
          <year>2009</year>
          .
          <article-title>Complex brain networks: graph theoretical analysis of structural and functional systems</article-title>
          .
          <source>Nat Rev Neurosci</source>
          <volume>10</volume>
          (
          <year>April 2009</year>
          ),
          <fpage>186</fpage>
          -
          <lpage>198</lpage>
          . DOI:http://dx.doi.org/10.1038/nrn2575
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <given-names>Carlos</given-names>
            <surname>Pais Montes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Maria Jesus</given-names>
            <surname>Freire Seoane</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Fernando</given-names>
            <surname>Gonzalez Laxe</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>General cargo and containership emergent routes: A complex networks description</article-title>
          .
          <source>Transport Policy</source>
          <volume>24</volume>
          (Nov.
          <year>2013</year>
          ),
          <fpage>126</fpage>
          -
          <lpage>140</lpage>
          . https://doi.org/10.1016/j.tranpol.2012.06.022
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <given-names>Chun Yong</given-names>
            <surname>Chong</surname>
          </string-name>
          and
          <string-name>
            <given-names>Sai Peck</given-names>
            <surname>Lee</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Analyzing maintainability and reliability of object-oriented software using weighted complex network</article-title>
          .
          <source>Journal of Systems and Software</source>
          <volume>110</volume>
          (Dec.
          <year>2015</year>
          ),
          <fpage>28</fpage>
          -
          <lpage>53</lpage>
          . https://doi.org/10.1016/j.jss.2015.08.014
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <given-names>Andrea</given-names>
            <surname>Fronzetti Colladon</surname>
          </string-name>
          and
          <string-name>
            <given-names>Elisa</given-names>
            <surname>Remondi</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Using Social Network Analysis to Prevent Money Laundering</article-title>
          .
          <source>Expert Systems With Applications</source>
          <volume>67</volume>
          (Jan.
          <year>2017</year>
          ),
          <fpage>49</fpage>
          -
          <lpage>58</lpage>
          . https://doi.org/10.1016/j.eswa.2016.09.029
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <given-names>Luis G.</given-names>
            <surname>Moyano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Mary Luz</given-names>
            <surname>Mouronte</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Maria Luisa</given-names>
            <surname>Vargas</surname>
          </string-name>
          .
          <year>2011</year>
          .
          <article-title>Communities and dynamical processes in a complex software network</article-title>
          .
          <source>Physica A</source>
          <volume>390</volume>
          ,
          <issue>4</issue>
          (Feb.
          <year>2011</year>
          ),
          <fpage>741</fpage>
          -
          <lpage>748</lpage>
          . https://doi.org/10.1016/j.physa.2010.10.026
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name>
            <given-names>R.</given-names>
            <surname>Milo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Shen-Orr</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Itzkovitz</surname>
          </string-name>
          , et al.
          <year>2002</year>
          .
          <article-title>Network motifs: simple building blocks of complex networks</article-title>
          .
          <source>Science</source>
          <volume>298</volume>
          (Oct.
          <year>2002</year>
          ),
          <fpage>824</fpage>
          -
          <lpage>827</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <given-names>Jean</given-names>
            <surname>Petric</surname>
          </string-name>
          and Tihana Galinac Grbac.
          <year>2014</year>
          .
          <article-title>Software structure evolution and relation to system defectiveness</article-title>
          .
          <source>EASE</source>
          (May
          <year>2014</year>
          ). DOI:http://dx.doi.org/10.1145/2601248.2601287
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <string-name>
            <given-names>Jean</given-names>
            <surname>Petric</surname>
          </string-name>
          , Tihana Galinac Grbac, and
          <string-name>
            <given-names>Mario</given-names>
            <surname>Dubravac</surname>
          </string-name>
          . 2014a.
          <article-title>Processing and Data Collection of Program Structures in Open Source Repositories</article-title>
          .
          <source>In Proceedings of the 3rd Workshop on Software Quality Analysis, Monitoring, Improvement and Applications (SQAMIA 2014)</source>
          , Lovran, Croatia, September 19-22,
          <year>2014</year>
          .
          <fpage>57</fpage>
          -
          <lpage>66</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <string-name>
            <given-names>J.</given-names>
            <surname>Petric</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. Galinac</given-names>
            <surname>Grbac</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Dubravac</surname>
          </string-name>
          . 2014b.
          <article-title>Software structure evolution and relation to system defectiveness</article-title>
          .
          <source>In Proceedings of SQAMIA 2014</source>
          . Lovran,Croatia,
          <fpage>57</fpage>
          -
          <lpage>66</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <string-name>
            <given-names>H.</given-names>
            <surname>Simon</surname>
          </string-name>
          .
          <year>1991</year>
          .
          <article-title>The Architecture of Complexity</article-title>
          ,
          <source>in: Facets of Systems Science (1st ed.)</source>
          . Springer, Boston, MA, USA.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <string-name>
            <given-names>S.</given-names>
            <surname>Jenkins</surname>
          </string-name>
          and
          <string-name>
            <given-names>S.R.</given-names>
            <surname>Kirk</surname>
          </string-name>
          .
          <year>2007</year>
          .
          <article-title>Software architecture graphs as complex networks: A novel partitioning scheme to measure stability and evolution</article-title>
          .
          <source>Information Sciences</source>
          <volume>177</volume>
          ,
          <issue>12</issue>
          (
          <year>June 2007</year>
          ),
          <fpage>2587</fpage>
          -
          <lpage>2601</lpage>
          . https://doi.org/10.1016/j.ins.2007.01.021
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>