<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>OSSMETER: Automated Measurement and Analysis of Open Source Software</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Bruno Almeida</string-name>
          <email>bruno.almeida@unparallel.pt</email>
          <xref ref-type="aff" rid="aff8">8</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sophia Ananiadou</string-name>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alessandra Bagnato</string-name>
          <email>alessandra.bagnato@softeam.fr</email>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alberto Berreteaga Barbero</string-name>
          <email>Alberto.Berreteaga@tecnalia.com</email>
          <xref ref-type="aff" rid="aff5">5</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Juri Di Rocco</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Davide Di Ruscio</string-name>
          <email>davide.diruscio@univaq.it</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Dimitrios S. Kolovos</string-name>
          <email>dimitris.kolovos@york.ac.uk</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ioannis Korkontzelos</string-name>
          <email>ioannis.korkontzelos@manchester.ac.uk</email>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Scott Hansen</string-name>
          <email>s.hansen@opengroup.org</email>
          <xref ref-type="aff" rid="aff6">6</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Pedro Maló</string-name>
          <xref ref-type="aff" rid="aff7">7</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Nicholas Matragkas</string-name>
          <email>nicholas.matragkas@york.ac.uk</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Richard F. Paige</string-name>
          <email>richard.paige@york.ac.uk</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jurgen Vinju</string-name>
          <email>Jurgen.Vinju@cwi.nl</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Centrum Wiskunde &amp; Informatica</institution>
          ,
          <country country="NL">The Netherlands</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Computer Science, University of York</institution>
          ,
          <country country="UK">United Kingdom</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Dipartimento di Ingegneria e Scienze dell'Informazione e Matematica, University of L'Aquila</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>National Centre for Text Mining (NaCTeM), University of Manchester</institution>
          ,
          <country country="UK">United Kingdom</country>
        </aff>
        <aff id="aff4">
          <label>4</label>
          <institution>SOFTEAM</institution>
          ,
          <country country="FR">France</country>
        </aff>
        <aff id="aff5">
          <label>5</label>
          <institution>TECNALIA</institution>
          ,
          <country country="ES">Spain</country>
        </aff>
        <aff id="aff6">
          <label>6</label>
          <institution>The Open Group</institution>
          ,
          <country country="BE">Belgium</country>
        </aff>
        <aff id="aff7">
          <label>7</label>
          <institution>UNINOVA</institution>
          ,
          <country country="PT">Portugal</country>
        </aff>
        <aff id="aff8">
          <label>8</label>
          <institution>UNPARALLEL</institution>
          ,
          <country country="PT">Portugal</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Deciding whether an open source software (OSS) project meets the required standards for adoption in terms of quality, maturity, activity of development and user support is not a straightforward process. It involves analysing various sources of information, including the project's source code repositories, communication channels, and bug tracking systems. OSSMETER extends state-of-the-art techniques in the field of automated analysis and measurement of OSS, and develops a platform that supports decision makers in the process of discovering, comparing, assessing and monitoring the health, quality, impact and activity of OSS. To achieve this, OSSMETER computes trustworthy quality indicators by performing advanced analysis and integration of information from diverse sources including the project metadata, source code repositories, communication channels and bug tracking systems of OSS projects.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Project data</title>
      <p>– Partners: The Open Group (Project Coordinator, Belgium), University of York
(Technical Coordinator, UK), University of L’Aquila (IT), Centrum Wiskunde &amp;
Informatica (NL), University of Manchester (UK), Tecnalia Research and
Innovation (ES), UNINOVA (PT), SOFTEAM (FR), Unparallel Innovation (PT)
– Start date: 1 October 2012, Duration: 30 months
– Website: http://www.ossmeter.eu</p>
    </sec>
    <sec id="sec-2">
      <title>Introduction</title>
      <p>Deciding whether an open source software (OSS) project meets the required standards
for adoption in terms of quality, maturity, activity of development and user support is
not a straightforward process; it involves analysing various sources of information –
including its source code repositories – to identify how actively the code is developed,
which programming languages are used, how well the code is commented, whether
there are unit tests etc. Additional information may be pertinent to the analysis, including
that from communication channels such as newsgroups, forums and mailing lists to
identify whether user questions are answered in a timely and satisfactory manner, to
estimate the number of experts and users of the software, its bug tracking system to
identify whether the software has many open bugs and at which rate bugs are fixed, and
other relevant metadata such as the number of downloads, the license(s) under which
it is made available, its release history etc. This task becomes even more challenging
when one needs to discover and compare several OSS projects that offer software of
similar functionality (e.g., there are more than 20 open source XML parsers for the Java
programming language), and make an evidence-based decision on which one should
be selected for the task at hand. Moreover, even when a decision has been made for
the adoption of a particular OSS product, decision makers need to be able to monitor
whether the OSS project continues to be healthy, actively developed and adequately
supported throughout the lifecycle of the software development project in which it is
used, in order to identify and mitigate any risks emerging from a decline in the quality
indicators of the project in a timely manner. Previous work in the field of OSS analysis
and measurement has mainly concentrated on analysing the source code of OSS
projects to calculate quality indicators and metrics.</p>
      <p>OSSMETER extends the scope and effectiveness of OSS analysis and
measurement with novel contributions on language-agnostic and language-specific methods for
source code analysis, and also proposes using state-of-the-art Natural Language
Processing (NLP) and text mining techniques such as question/answer extraction,
sentiment analysis and thread clustering to analyse and integrate relevant information
extracted from communication channels (newsgroups, forums, mailing lists), and bug
tracking systems supporting OSS projects, in order to provide a more comprehensive
picture of the quality indicators of OSS projects, and facilitate better evidence-based
decision making and monitoring. OSSMETER also provides metamodels for capturing the
meta-information relevant to OSS projects, and effective quality indicators, in a
rigorous and consistent manner that enables direct comparison between OSS projects. These
contributions are integrated in the form of an extensible cloud-based platform through
which users can register, discover and compare OSS projects, but which can also be
extended in order to support quality analysis and monitoring of proprietary software
development projects. To summarize, the scientific and technological objectives achieved
by OSSMETER are:
– comprehensive domain modelling for the domain of open source software
development; identification and formal representation of the meta-information that needs
to be captured in order to extract meaningful quality indicators for OSS projects;
– extraction of quality metrics by analysing aspects related to the source code and
the development team behind an OSS project;
– extraction of quality metrics related to the communication channels, and bug
tracking facilities of OSS projects using Natural Language Processing and text mining
techniques;
– development of an extensible cloud-based platform that can monitor and
incrementally analyse a large number of OSS projects, and a web-application to present their
related quality metrics in an intuitive manner that aids decision making.</p>
      <p>The following sections describe these objectives and, for each of them, discuss the progress
beyond the state of the art.</p>
    </sec>
    <sec id="sec-3">
      <title>Domain Modeling and OSS Project Lifecycle Analysis</title>
      <p>State of the art: Modeling and abstracting open source software and its management
have been the focus of a number of projects and research activities aiming at
understanding the current practice in OSS projects, e.g., for information and documentation
purposes. The Qualipso project<sup>10</sup> analysed many OSS projects in order to identify typical
roles (e.g., user, maintainer, and developer), information sources (e.g., help documents,
release notes, and source code repositories), and their relations. Qualipso also analysed
widely used forges (e.g., SourceForge and Google Code) in order to identify the services
that are typically provided to forge users, and the metadata that is used to describe
and support OSS projects. This was done because there is no common agreement
on the formats and metadata to be used throughout the lifecycle of OSS
projects, which hampers the homogeneous treatment of projects maintained
in different forges.</p>
      <p>
        Other works (e.g., [
        <xref ref-type="bibr" rid="ref10 ref9">9,10</xref>
        ]) created abstract models of OSS projects in order to
understand their architecture, and their evolution over time. In particular, [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] addresses the
structural characteristics of OSS projects, specifically the organization of the software’s
constituent components. In [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] the authors, by leveraging the “4+1” view model [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ],
and the four architectural views of software systems defined in [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], focus on the views
which are closer to the work of OSS developers, such as the
directory and file levels. The work in [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] proposes models and metrics to support
the defect prediction for OSS projects. In particular, in addition to static code attributes
for modeling software data in defect prediction, the authors introduce alternative metric
sets, such as history and organizational metrics.
      </p>
      <p>
        To improve both the quality and the trustworthiness perception of OSS products, [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ]
introduces the idea of certifying the testing process of an OSS system. In this respect,
the authors identify peculiar characteristics of OSS projects that might influence the
testing process. The work also defines a certification model that companies, developers,
and final users can follow to evaluate the maturity level of an OSS testing process.
      </p>
      <p>10 Qualipso: Leveraging Open Source for Boosting Industry Growth. http://www.qualipso.org/</p>
      <p>
        Innovation: According to the works outlined above, the whole lifecycle of OSS
projects can be analyzed by means of ad-hoc techniques specifically defined to retrieve
heterogeneous information available from different sources in different formats.
OSSMETER advances state-of-the-art techniques by providing the means to create models
that represent different aspects of OSS projects in a homogeneous manner, in order to
enable objective comparisons of OSS alternatives with respect to user needs and quality
requirements [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ]. In particular, OSSMETER has developed:
– Metamodels for the specification of models representing the whole lifecycle of
OSS projects. By considering and enhancing existing domain models, a set of
EMF/Ecore<sup>11</sup>-based metamodels and supporting tools has been conceived in order to
enable the representation of OSS projects;
– Metamodels for OSS project metrics to enable automated measurement of open
source software.
      </p>
    </sec>
    <sec id="sec-4">
      <title>Source Code Quality and Activity Analysis</title>
      <p>
        State of the art: Software metrics are a widely studied subject and are used in practice,
for instance in the form of Function Points (FP) to measure the size of software (see
International Function Point User Group, IFPUG<sup>12</sup>). Software metrics are widely used for
the global analysis of productivity and quality of software [
        <xref ref-type="bibr" rid="ref12 ref15">12,15</xref>
        ]. All work on activity
analysis is ultimately based on the original work of Lehman [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] who also coined the
term software evolution. There is a wide range of tools available for performing specific
analyses on source code as well as for computing various metrics. However, it is not easy
to combine the results of different analyses, and the same holds for metrics:
the results produced by different tools are not comparable since they use different
definitions for the underlying metrics. In addition, most of these tools are hand-coded and
have to be reimplemented for each language.
      </p>
      <p>Innovation: OSSMETER provides an integrated view and corresponding tooling to
perform analyses, metrics calculations and activity analysis on several implementation
languages. The main innovations are:
– Definition of a coherent set of indicators for code quality and activity
analysis. These indicators are usable across different implementation languages, different
implementation platforms, and different version repository systems;
– Generation of the required tooling from declarative metrics descriptions using
innovative model-driven/generation-based techniques.</p>
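      <p>To make the idea of an indicator that is usable across implementation languages more concrete, the following Python fragment gives a minimal sketch in which the language-specific part is reduced to a small declarative description and a single generic function computes the metric. The names LanguageSpec, SPECS and comment_density, the choice of comment density as the metric, and the two language entries are assumptions made purely for illustration; they are not OSSMETER's own metric descriptions or generated tooling.</p>
      <preformat><![CDATA[
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LanguageSpec:
    """Hypothetical declarative description of a language's line comments."""
    name: str
    line_comment: str


# Illustrative entries only; not taken from OSSMETER.
SPECS = {
    "java": LanguageSpec("java", "//"),
    "python": LanguageSpec("python", "#"),
}


def comment_density(source: str, spec: LanguageSpec) -> float:
    """Return the fraction of non-blank lines that are full-line comments."""
    lines = [line.strip() for line in source.splitlines() if line.strip()]
    if not lines:
        return 0.0
    comments = sum(1 for line in lines if line.startswith(spec.line_comment))
    return comments / len(lines)


java_src = "// add two numbers\nint add(int a, int b) {\n  return a + b;\n}\n"
py_src = "# add two numbers\ndef add(a, b):\n    return a + b\n"

# The same metric definition is applied to both languages.
print(comment_density(java_src, SPECS["java"]))
print(comment_density(py_src, SPECS["python"]))
```
]]></preformat>
      <p>Because the metric is defined once and only the description varies, values computed for projects written in different languages follow the same definition and can be compared directly, which is the property the indicators above aim for.</p>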
    </sec>
    <sec id="sec-5">
      <title>Communication Channel and Bug Tracking System Analysis</title>
      <p>
        State of the art: Structuring and analysing textual data in forum, newsgroup and
community-based question and answer threads is a newly emerging and complex problem
in text mining. Peer users are the cornerstone of managing software defects in OSS,
due to their involvement in online forums [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. Nevertheless, empirical studies regarding
open source quality assurance activities and quality claims are rare [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. OSS forums
and bug-tracking systems concentrate vast amounts of knowledge generated daily about
problems and their solutions as well as feedback to requests for OSS improvement.
      </p>
      <p>11 Eclipse EMF: https://www.eclipse.org/modeling/emf/</p>
      <p>12 http://www.ifpug.org/</p>
      <p>
        Mining this textual data can match solutions to problems, evaluate solution quality
and substantially enhance user access to solutions and support [
        <xref ref-type="bibr" rid="ref13 ref21 ref7">7,21,13</xref>
        ]. Due to the size
of this textual information, extracting, managing and evaluating it without manual
intervention is a demanding, costly, impractical and probably impossible task. Text mining
tools that automatically analyse, extract, summarise and assess information found in
the threads of discussions on online forums are valuable for supporting OSS. Although
text mining techniques have been used extensively in domains such as biomedicine [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ],
finance [
        <xref ref-type="bibr" rid="ref16 ref19">19,16</xref>
        ], and competitive intelligence [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ], very little work has been accomplished
on applying text mining techniques for analysing threads of online forums.
      </p>
      <p>
        Innovation: The target of this analysis is to extract from OSS forums and bug-tracking
systems as many indicators as possible about the characteristics and the quality of the
communication that takes place. Due to the complexity of the problem, a number
of text mining technologies have been combined and structured in levels. After
collecting online forum threads, the first level identifies the type of each post
as question, answer or supplementary text (context). Next, posts are classified
into more fine-grained categories and similarity-based methods are employed to
identify chains of questions, contexts and answers within each thread, i.e. to identify which
answers and contexts correspond to which question. Thirdly, posts are analysed for
sentiment and attitude. The output of this stage is a fundamental source
of evidence for quality assessments. Finally, clustering together semantically
similar threads and labelling the resulting clusters provides hints about the error-prone
aspects of each OSS project, or the parts of it that need to be improved. The output of each level is
two-fold: a number of indicators of the quality of the input posts with respect to the specific
aspect that the corresponding component analyses; and supplementary output that is
useful for the following components, but not necessarily part of the overall system output.
      </p>
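      <p>The first three levels described above can be sketched roughly as follows. This is a toy illustration under strong assumptions, not the project's text mining components: post types are guessed with keyword heuristics, the similarity-based linking uses plain Jaccard overlap of word sets, and sentiment is a count over two tiny hypothetical word lists. The function names and the example thread are invented for the sketch, and the final clustering level is omitted.</p>
      <preformat><![CDATA[
```python
import re


def tokens(text):
    """Lower-cased set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def classify_post(text):
    """Level 1: label a post as 'question', 'answer' or 'context' (toy heuristic)."""
    if "?" in text:
        return "question"
    if re.search(r"\b(try|use|fixed|solution|should)\b", text.lower()):
        return "answer"
    return "context"


def jaccard(a, b):
    """Word-set overlap between two posts, between 0.0 and 1.0."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def link_to_questions(posts):
    """Level 2: attach every non-question post to its most similar question."""
    labelled = [(classify_post(p), p) for p in posts]
    questions = [p for label, p in labelled if label == "question"]
    chains = {q: [] for q in questions}
    for label, post in labelled:
        if label != "question" and questions:
            best = max(questions, key=lambda q: jaccard(q, post))
            chains[best].append((label, post))
    return chains


def sentiment(text):
    """Level 3: crude polarity score from tiny hypothetical word lists."""
    positive = {"thanks", "great", "works"}
    negative = {"broken", "crash", "fails"}
    t = tokens(text)
    return len(t & positive) - len(t & negative)


posts = [
    "How do I parse a large XML file without a crash?",
    "Try the streaming parser for a large XML file.",
    "Why does the build fail on Windows?",
    "The build on Windows needs the latest compiler, use version 2.",
    "Thanks, the XML file parses fine now, works great.",
]

chains = link_to_questions(posts)
for question, replies in chains.items():
    print(question)
    for label, reply in replies:
        print("  ", label, sentiment(reply), reply)
```
]]></preformat>
      <p>Each function returns both an indicator (a label, a link, a score) and data that the next step consumes, mirroring the two-fold output of each level described above.</p>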
    </sec>
    <sec id="sec-6">
      <title>OSSMETER platform</title>
      <p>
        State of the art: In the last decade several projects have provided platforms that
support automated measurement of open source software including FLOSSMETRICS [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ],
Qualoss [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], SQO-OSS (Alitheia Core) [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] and Ohloh [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Also, many OSS forges such
as SourceForge, Google Code and GitHub provide built-in annotation and measurement
facilities for the OSS projects they host.
      </p>
      <p>
        The aim of the FLOSS<sup>13</sup> project was to develop indicators of
non-monetary/transmonetary economic activity through a case study of OSS, and to assess OSS business
models and best practices, and policy/regulatory impact. Its successor, the
FLOSSMETRICS project [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], integrated a number of source code, bug tracking and mailing list
extraction tools into a web-based platform which monitors a selection of open source
projects and provides the extracted data in the form of SQL files, which then need to be
loaded into a local database in order to be further analysed.
      </p>
      <p>13 http://www.flossproject.org/</p>
      <p>
        Qualoss [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] aimed at automating the quality measurement of open source software.
The Qualoss platform was conceived to analyse two types of data, source code and
project-repository information, and does not appear to measure aspects related to
communication channels or bug tracking systems of OSS projects.
      </p>
      <p>
        Alitheia-Core [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] is a platform which aims at enabling software engineering
research targeting OSS projects. Alitheia-Core provides support for processing source
code repositories, emails from mailing lists and bug tracking systems through an API
that developers can use in order to implement metrics and experiments. The design of
Alitheia-Core is similar to the envisioned design of the OSSMETER platform, but
the platform itself does not appear to provide any implemented metrics related to
mailing lists and bug tracking systems.
      </p>
      <p>
        Ohloh [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] is a free but proprietary and closed source system, only provided as a
hosted service. Ohloh only analyses information related to the source code of OSS
projects and does not take communication channels or bug tracking systems into
consideration. However, it provides OSS project classification facilities (through user-defined
tags), enables OSS project discovery and comparison, and presents source-code and
activity-related metrics in an intuitive and understandable manner. On the downside,
beyond not taking communication channels and bug tracking systems into consideration,
being closed source means that organisations cannot run their own local instance
of Ohloh through which they could monitor only the open source projects they are
interested in, or their own proprietary projects. Also, as the system is proprietary,
developers cannot extend it with features such as support for new metrics, access to
additional sources of information, or integration with custom version control management
systems.
      </p>
      <p>Fig. 1. OSSMETER system architecture (components shown: OSSMETER Web Application, which consumes the REST API; Metric Providers and Fact Providers; OSSMETER Platform; Persistence, comprising Database and Filesystem).</p>
      <p>As mentioned above, OSS forges such as SourceForge, Google Code and GitHub
provide built-in facilities for capturing additional information (metadata) about projects
such as the category they belong to, the languages they are implemented in, relevant
news feeds, and activity indicators such as user reviews, number of developers, and
number of downloads. However, each OSS forge captures a different set of metadata
and as such, projects hosted in different forges are not directly comparable. Moreover,
none of these forges provides advanced source code, communication channel, and bug
tracking system content analysis features such as those proposed by OSSMETER.</p>
      <p>Innovation: The OSSMETER platform integrates and extends components and results
produced by the projects discussed above in order to provide the comprehensive system
shown in Fig. 1 for analysing and monitoring OSS projects. The novel features of the
OSSMETER system are:
– a scalable and efficient data storage, which is responsible for storing and retrieving
project-specific metadata and metric measurements. Local disk storage can
also be used to hold temporary data required for the analysis, such as clones of source
repositories.
– support for automated classification of OSS projects and discovery of related projects
based on source code, communication channel and bug tracking system analysis through
the use of advanced NLP and text mining techniques. To this end, different kinds of
measurement components are provided, namely fact providers, metric providers, and
factoids. Fact providers perform utility measurements and store factual data that can be
consumed by other fact/metric providers. Metric providers optionally use computed
facts to measure one or more project aspects and store the result in the database.
Finally, factoids can aggregate heterogeneous metric providers into a four-star rating.
– an extensible platform implemented using a plug-in based approach (OSGi), which
is responsible for the integration of the various OSSMETER components, as well as
for their scheduling, execution, and orchestration. The OSSMETER platform is also
responsible for mining the OSS data, which are then passed to the various metric
providers for analysis.
– a REST API that enables software engineering researchers to access calculated quality
indicators in order to perform additional analysis, and developers of third-party software
to provide added-value services on top of the OSSMETER platform.
– a usable web application developed on top of the platform that enables end-users
to explore and compare OSS projects in an intuitive manner. The presentation of the
information about software projects can be fully customised at the user level and is
based on custom quality models.</p>
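      <p>The relationship between fact providers, metric providers and factoids described above can be illustrated with a small Python sketch. It is a simplified model and not the platform's actual plug-in interfaces (which are OSGi-based): the Project record, the function names, the bug-related measurements and the star thresholds are all assumptions chosen for the example.</p>
      <preformat><![CDATA[
```python
from dataclasses import dataclass


@dataclass
class Project:
    """Hypothetical project record; field names are illustrative."""
    name: str
    bug_open_days: list  # number of days each resolved bug stayed open
    open_bugs: int       # bugs that are still open


def bug_facts(project):
    """Fact provider: compute reusable factual data about the project."""
    return {
        "resolved": len(project.bug_open_days),
        "total_open_days": sum(project.bug_open_days),
        "open": project.open_bugs,
    }


def mean_fix_time(facts):
    """Metric provider: average days to fix a bug, or None without resolved bugs."""
    if facts["resolved"] == 0:
        return None
    return facts["total_open_days"] / facts["resolved"]


def open_ratio(facts):
    """Metric provider: share of all known bugs that are still open."""
    total = facts["resolved"] + facts["open"]
    return facts["open"] / total if total else 0.0


def bug_factoid(project):
    """Factoid: aggregate both metrics into a rating of one to four stars.

    The thresholds (30 days, 7 days, 25% open) are arbitrary assumptions.
    """
    facts = bug_facts(project)
    fix_time = mean_fix_time(facts)
    stars = 1
    if fix_time is not None and fix_time <= 30:
        stars += 1
    if fix_time is not None and fix_time <= 7:
        stars += 1
    if open_ratio(facts) <= 0.25:
        stars += 1
    return stars


responsive = Project("responsive", [2, 4, 6], 1)
neglected = Project("neglected", [60, 90], 6)
print(responsive.name, bug_factoid(responsive))
print(neglected.name, bug_factoid(neglected))
```
]]></preformat>
      <p>Keeping facts separate from metrics means that several metric providers can reuse one expensive measurement, while the factoid reduces heterogeneous metrics to a single rating that is easy to compare across projects.</p>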
    </sec>
    <sec id="sec-7">
      <title>Conclusion</title>
      <p>OSSMETER officially ended on March 31, 2015. At the time of writing, the
use case providers were evaluating the OSSMETER technologies
on real OSS projects from different application domains. Some of the
projects considered during the evaluation are shown in Table 1. These projects were
chosen based on their characteristics, such as size, age, number of developers, and
number of commits. The code of the OSSMETER platform is publicly available online at
https://github.com/ossmeter/ossmeter. It is possible to download a
locally deployable version of the OSSMETER system that users can install – and, if
needed, extend – in order to monitor a custom selection of OSS projects of interest
and/or internal software development projects. As of
June 2015, the OSSMETER GitHub repository counted more than 650K lines of code,
1,800 commits, 3 branches, 8 releases, and 8 contributors. More than 30 technical
deliverables were produced to present the technologies developed during the project. The
official installation of OSSMETER is available at www.ossmeter.com.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>1. FLOSSMETRICS: Free/Libre/Open Source Software Metrics. http://www.flossmetrics.org/.</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>2. Ohloh Project. http://www.sqo-oss.org/.</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>3. QUALOSS: Quality in Open Source Software. http://www.qualoss.org/.</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>4. SQO-OSS: Alitheia Core. http://www.sqo-oss.org/.</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>Faheem</given-names>
            <surname>Ahmed</surname>
          </string-name>
          , Piers Campbell, Ahmad Jaffar, and Luiz Fernando Capretz.
          <article-title>Managing support requests in open source software project: The role of online forums</article-title>
          .
          <source>In ICCSIT 2009</source>
          , pages
          <fpage>590</fpage>
          -
          <lpage>594</lpage>
          . IEEE,
          <year>August 2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>Sophia</given-names>
            <surname>Ananiadou</surname>
          </string-name>
          and
          <string-name>
            <given-names>John</given-names>
            <surname>McNaught</surname>
          </string-name>
          .
          <article-title>Text Mining for Biology And Biomedicine</article-title>
          . Artech House, Inc., Norwood, MA, USA,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>Timothy</given-names>
            <surname>Baldwin</surname>
          </string-name>
          , David Martinez, Richard B. Penman,
          <string-name>
            <given-names>Su N.</given-names>
            <surname>Kim</surname>
          </string-name>
          , Marco Lui,
          <string-name>
            <given-names>Li</given-names>
            <surname>Wang</surname>
          </string-name>
          , and Andrew MacKinlay.
          <article-title>Intelligent Linux information access by data mining: the ILIAD project</article-title>
          .
          <source>In Procs. NAACL HLT 2010</source>
          , pages
          <fpage>15</fpage>
          -
          <lpage>16</lpage>
          . Association for Computational Linguistics,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>Bora</given-names>
            <surname>Caglayan</surname>
          </string-name>
          , Ayse Bener, and
          <string-name>
            <given-names>Stefan</given-names>
            <surname>Koch</surname>
          </string-name>
          .
          <article-title>Merits of using repository metrics in defect prediction for open source projects</article-title>
          .
          <source>In Procs. of FLOSS'09</source>
          , pages
          <fpage>31</fpage>
          -
          <lpage>36</lpage>
          . IEEE Computer Society,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>Andrea</given-names>
            <surname>Capiluppi</surname>
          </string-name>
          and
          <string-name>
            <given-names>Karl</given-names>
            <surname>Beecher</surname>
          </string-name>
          .
          <article-title>Structural complexity and decay in floss systems: An inter-repository study</article-title>
          .
          <source>In Procs. of CSMR '09</source>
          , pages
          <fpage>169</fpage>
          -
          <lpage>178</lpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <given-names>Andrea</given-names>
            <surname>Capiluppi</surname>
          </string-name>
          , Cornelia Boldyreff, and
          <string-name>
            <given-names>Klaas-Jan</given-names>
            <surname>Stol</surname>
          </string-name>
          .
          <article-title>Successful reuse of software components: A report from the open source perspective</article-title>
          .
          <source>In Open Source Systems: Grounding Research - 7th IFIP WG 2.13 International Conference, OSS 2011</source>
          , pages
          <fpage>159</fpage>
          -
          <lpage>176</lpage>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <given-names>Robert</given-names>
            <surname>Glass</surname>
          </string-name>
          .
          <article-title>Is open source software more reliable? An elusive answer</article-title>
          .
          <source>The Software Practitioner</source>
          ,
          <volume>11</volume>
          (
          <issue>6</issue>
          ),
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <given-names>R.</given-names>
            <surname>Grady</surname>
          </string-name>
          .
          <source>Effective Software Measurements</source>
          . Prentice Hall
          ,
          <year>1992</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <given-names>Ahmed</given-names>
            <surname>Hassan</surname>
          </string-name>
          , Vahed Qazvinian, and
          <string-name>
            <given-names>Dragomir</given-names>
            <surname>Radev</surname>
          </string-name>
          .
          <article-title>What's with the attitude?: identifying sentences with attitude in online discussions</article-title>
          .
          <source>In Procs. EMNLP'10</source>
          , pages
          <fpage>1245</fpage>
          -
          <lpage>1255</lpage>
          , Stroudsburg, PA, USA,
          <year>2010</year>
          .
          Association for Computational Linguistics
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <given-names>Christine</given-names>
            <surname>Hofmeister</surname>
          </string-name>
          , Robert Nord, and Dilip Soni.
          <source>Applied Software Architecture</source>
          . Addison-Wesley
          ,
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <given-names>Capers</given-names>
            <surname>Jones</surname>
          </string-name>
          .
          <source>Applied Software Measurement: Global Analysis of Productivity and Quality, Third Edition</source>
          . McGraw Hill
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <given-names>A.</given-names>
            <surname>Kloptchenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Eklund</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Karlsson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Back</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Vanharanta</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Visa</surname>
          </string-name>
          .
          <article-title>Combining data and text mining techniques for analysing financial reports: Research articles</article-title>
          .
          <source>Int. Journ. of Intelligent Systems in Accounting, Finance and Management</source>
          ,
          <volume>12</volume>
          :
          <fpage>29</fpage>
          -
          <lpage>41</lpage>
          ,
          <month>January</month>
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <given-names>Philippe</given-names>
            <surname>Kruchten</surname>
          </string-name>
          .
          <article-title>The 4+1 view model of architecture</article-title>
          .
          <source>IEEE Software</source>
          ,
          <volume>12</volume>
          (
          <issue>5</issue>
          ):
          <fpage>88</fpage>
          -
          <lpage>93</lpage>
          ,
          <year>1995</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <given-names>M.M.</given-names>
            <surname>Lehman</surname>
          </string-name>
          .
          <article-title>Programs, life cycles, and laws of software evolution</article-title>
          .
          <source>In Proceedings IEEE</source>
          , volume
          <volume>68</volume>
          , pages
          <fpage>1060</fpage>
          -
          <lpage>1076</lpage>
          ,
          <year>1980</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <given-names>Hsin-Min</given-names>
            <surname>Lu</surname>
          </string-name>
          , Hsinchun Chen,
          <string-name>
            <given-names>Tsai-Jyh</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Mao-Wei</given-names>
            <surname>Hung</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Shu-Hsing</given-names>
            <surname>Li</surname>
          </string-name>
          .
          <article-title>Financial text mining: Supporting decision making using web 2.0 content</article-title>
          .
          <source>IEEE Intelligent Systems</source>
          , pages
          <fpage>78</fpage>
          -
          <lpage>82</lpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <given-names>Sandro</given-names>
            <surname>Morasca</surname>
          </string-name>
          , Davide Taibi, and
          <string-name>
            <given-names>Davide</given-names>
            <surname>Tosi</surname>
          </string-name>
          .
          <article-title>Towards certifying the testing process of open-source software: New challenges or old methodologies?</article-title>
          <source>In Procs. FLOSS'09</source>
          , pages
          <fpage>25</fpage>
          -
          <lpage>30</lpage>
          , Washington, DC, USA,
          <year>2009</year>
          . IEEE Computer Society.
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <given-names>Li</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Su N.</given-names>
            <surname>Kim</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Timothy</given-names>
            <surname>Baldwin</surname>
          </string-name>
          .
          <article-title>Thread-level analysis over technical user forum data</article-title>
          .
          <source>In Procs. of the Australasian Language Technology Association Workshop 2010</source>
          , pages
          <fpage>27</fpage>
          -
          <lpage>31</lpage>
          , Melbourne, Australia,
          <month>December</month>
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <given-names>James R.</given-names>
            <surname>Williams</surname>
          </string-name>
          , Davide Di Ruscio, Juri Di Rocco, and
          <string-name>
            <given-names>Dimitrios S.</given-names>
            <surname>Kolovos</surname>
          </string-name>
          .
          <article-title>Models of OSS Project Meta-Information: A Dataset of Three Forges</article-title>
          .
          <source>In MSR2014 at ICSE2014</source>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <given-names>Alessandro</given-names>
            <surname>Zanasi</surname>
          </string-name>
          .
          <source>Text Mining and its Applications to Intelligence, CRM and Knowledge Management</source>
          . WIT Press,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>