<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>and Renata S. S. Guizzardi. An ontology of online user feedback
in software engineering. Journal of Applied Ontology</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Research on NLP for RE at the FBK-Software Engineering Research Line: A Report</article-title>
      </title-group>
      <contrib-group>
        <aff id="aff0">
          <label>0</label>
          <institution>11] Hamish Cunningham</institution>
          ,
          <addr-line>Diana Maynard, Kalina Bontcheva, Valentin Tablan, Niraj Aswani, Ian Roberts, Genevieve Gorrell, Adam Funk, Angus Roberts, Danica Damljanovic, Thomas Heitz, Mark A. Greenwood, Horacio Saggion, Johann Petrak, Yaoyong Li, and Wim Peters. Text Processing with GATE (Version 6). 2011</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Anna Perini Fondazione Bruno Kessler</institution>
          ,
          <addr-line>FBK</addr-line>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Fitsum Meshesha Kifetew Fondazione Bruno Kessler</institution>
          ,
          <addr-line>FBK</addr-line>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Itzel Morales-Ramirez. Exploiting Online User Feedback in Requirements Engineering. PhD thesis, University of Trento, Italy</institution>
          ,
          <addr-line>2015</addr-line>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2075</year>
      </pub-date>
      <volume>2075</volume>
      <fpage>4</fpage>
      <lpage>8</lpage>
      <abstract>
        <p>In this short report, we introduce the Software Engineering research unit at Fondazione Bruno Kessler and summarise its research in the area of requirements engineering, in which natural language processing techniques have been exploited to build tools in support of software engineers. Ongoing and longer-term research objectives are briefly outlined.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Overview</title>
      <p>Table 1. Activities listed: validation and verification of requirements specifications; elicitation of requirements-relevant information; stakeholder feedback analysis for software maintenance and evolution in OSS; requirements management; requirements management for software evolution. References listed: [CRST12, CRST11]; [MPC14, MRKP18]; [MRKP18]; [KPS18].</p>
      <p>Generating semi-formal requirements specifications from requirements descriptions expressed
as unstructured text.</p>
      <p>The first activity (summarized in the first row of Table 1) refers to the formalization of the European
railway signaling system specifications, carried out in the context of the EURAILCHECK ERA project<sup>3</sup>.
A key result of this work is a methodology for the analysis and verification of requirements specifications that
consists of three main steps. The first step identifies specific patterns in the requirements
specification documents, which consist of tens of pages of unstructured textual descriptions. This step
distinguishes different categories of specifications, such as glossary, describing an entity of the railway domain, or
behavior, describing the behavior of the entities in the domain. Dependency relationships among those categories
are also identified. The output of this step is a set of categorized requirements fragments. The second step
formalizes the requirements fragments into Linear Temporal Logic (LTL) formulae; its output
is a set of formalized requirements fragments. Finally, the third step enables
the verification and validation of the formalized requirements fragments (e.g. portions of the signaling system's
specifications) via formal techniques, such as model checking [CRST12, CRST11].</p>
      <p>The implemented approach includes a set of rules for the manual annotation and categorization of textual
chunks, and for the translation of these chunks into LTL formulae. For example, "The speed of the Train at the
end of the Movement Authority must be 0" is a chunk of specifications that involves the domain entities Train
and Movement Authority and describes a behavioral property of the train, which is translated into the formal
counterpart: "always (Train.position = MovementAuthority.end implies Train.speed = 0)". If model checking
identifies inconsistencies in the formalized requirements fragments, they can be traced back to the
corresponding categorized requirements fragments, and then to the original informal requirements, so that the
requirements engineer can address the identified flaw.</p>
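      <p>To make the formal counterpart above concrete, the following minimal sketch evaluates the same property over a finite sequence of states. It is only an illustration under stated assumptions: the State class, its field names and the example traces are hypothetical, and a finite-trace check is a much weaker substitute for the model checking used in the cited work.</p>
      <preformat><![CDATA[
```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class State:
    """One snapshot of a hypothetical train model (names are assumptions)."""
    train_position: float   # stands for Train.position
    train_speed: float      # stands for Train.speed
    authority_end: float    # stands for MovementAuthority.end


def implies(antecedent: bool, consequent: bool) -> bool:
    """Material implication: false only if antecedent holds and consequent fails."""
    return (not antecedent) or consequent


def always_stops_at_authority_end(trace: Iterable[State]) -> bool:
    """Finite-trace reading of: always (position = end implies speed = 0)."""
    return all(
        implies(s.train_position == s.authority_end, s.train_speed == 0)
        for s in trace
    )


# Illustrative traces: the first stops at the end of the authority, the second does not.
ok_trace = [State(0.0, 20.0, 100.0), State(60.0, 10.0, 100.0), State(100.0, 0.0, 100.0)]
bad_trace = [State(0.0, 20.0, 100.0), State(100.0, 5.0, 100.0)]

print(always_stops_at_authority_end(ok_trace))   # True
print(always_stops_at_authority_end(bad_trace))  # False
```
]]></preformat>
      <p>A violating state, such as the last one in bad_trace, plays the role of the inconsistency that would be traced back to the categorized fragment and then to the informal requirement.</p>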
      <p>Automated analysis of online discussions for RE purposes.</p>
      <p>The research work on requirements elicitation from online discussions started in the context of a PhD
thesis [Mor15], and was further extended in the SUPERSEDE project<sup>4</sup>, when developing automated
analysis techniques of user-feedback for RE purposes [MRKP18]. The objective of the PhD research was to
investigate whether online discussions about software, such as open-source software (OSS) user forums, contain
requirements-relevant information that could be useful for improving the software, and to develop a novel linguistic
technique to automate the analysis of large online discussion datasets in English. Among the main results of this
work are an ontology of online user-feedback [MPG15] and a new linguistic technique for the analysis of online
discussions, which is based on speech-act theory [MPC14]. The speech-acts considered for the proposed technique
are taken from a taxonomy originally proposed by Bach and Harnish [BH79] for the linguistics community, which
we revised and adapted for the software development domain. The speech-act based analysis technique builds
on the hypothesis that textual messages sent by software users are meant to suggest a new feature, to request
enhancements of an existing functionality, or simply to complain. The idea is that the speech-acts contained in
these messages can be used as indicators for identifying the different types of user requests. For instance, from
an experiment on a dataset of 161,120 messages from online discussions in the Apache OpenOffice issue tracking
system, we found that speech-acts of types Requestive, Requirement, and Accept are strongly present in
user-feedback messages concerning the request of new functionalities or the enhancement of existing ones. The
implementation of the speech-act based analysis technique for online discussions includes: i) preprocessing steps,
which extract online discussion threads from the issue tracking system as XML files, which are then cleaned, parsed
and stored in a MySQL database; ii) a set of 142 lexico-syntactic rules for the annotation of twenty
speech-acts, which exploit gazetteers containing verbs related to each speech-act and are implemented in GATE
[CMB+11]; iii) properties extracted from the resulting annotated textual messages, which are used for building
a classifier that automatically classifies feedback as bug report, enhancement or new feature
request. For the classification, three ML algorithms (J48, SMO, and Random Forest) available in the Weka<sup>5</sup> library
have been used. These three algorithms were chosen because a preliminary assessment showed that
they performed best on the given dataset.</p>
      <p><sup>3</sup> https://es-static.fbk.eu/projects/eurailcheck/</p>
      <p><sup>4</sup> An EU H2020 funded research and innovation project, https://www.supersede.eu</p>
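      <p>The annotation and property-extraction steps described above can be sketched roughly as follows. This is not the GATE rule set from the cited work: the gazetteers below are tiny hypothetical examples, plain regular expressions stand in for the 142 lexico-syntactic rules, and only three of the twenty speech-acts are shown. The resulting count vectors are the kind of properties that could then be passed to a classifier.</p>
      <preformat><![CDATA[
```python
import re
from collections import Counter
from typing import List

# Hypothetical mini-gazetteers: cue words or phrases per speech-act type.
# These are illustrative assumptions, not the gazetteers of the cited work.
GAZETTEERS = {
    "Requestive": ["please", "could you", "would you", "request"],
    "Requirement": ["need", "needs", "should", "must", "require"],
    "Accept": ["agree", "accept", "ok with"],
}

# One case-insensitive pattern per speech-act, matching whole words or phrases.
PATTERNS = {
    act: re.compile(
        r"\b(?:" + "|".join(re.escape(cue) for cue in cues) + r")\b",
        re.IGNORECASE,
    )
    for act, cues in GAZETTEERS.items()
}


def annotate(message: str) -> Counter:
    """Count how many cues of each speech-act type occur in a message."""
    return Counter({act: len(p.findall(message)) for act, p in PATTERNS.items()})


def to_feature_vector(message: str) -> List[int]:
    """Fixed-order vector of speech-act counts (alphabetical by speech-act name)."""
    counts = annotate(message)
    return [counts[act] for act in sorted(GAZETTEERS)]


msg = "Could you please add PDF export? The tool should keep the page layout."
print(to_feature_vector(msg))  # order: Accept, Requestive, Requirement -> [0, 2, 1]
```
]]></preformat>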
      <p>Automated analysis of online user-feedback for software maintenance and evolution.</p>
      <p>A follow-up of the research described above was developed in the context of the SUPERSEDE project, whose
overall objective was the development of a highly configurable tool suite that enables a user-feedback driven
approach to software evolution. To realize such an approach, one key problem is supporting
a development team in eliciting bug-fix, new feature or feature enhancement requests from large sets of textual
user-feedback. To address this problem, we developed a user-feedback analysis tool that combines the previously
mentioned speech-act based analysis technique with sentiment analysis, with the aim of automating the analysis of large
sets of textual feedback. We obtained encouraging results when validating it on the project's industrial use cases,
in particular on feedback messages from end users of a household energy management software application. For
example, when applied to a dataset containing 575 feedback messages (translated to English from German, as the
software has a predominantly German-speaking user base) from the users of the energy management software,
enhancement requests were classified with 86% precision and 77% recall (F-measure 0.81) [MRKP18].</p>
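      <p>The reported F-measure is consistent with the usual definition as the harmonic mean of precision and recall, F = 2PR / (P + R). The short check below assumes that this standard (balanced) definition is the one used; the function name is only illustrative.</p>
      <preformat><![CDATA[
```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported for enhancement requests: 86% precision, 77% recall.
print(round(f_measure(0.86, 0.77), 2))  # 0.81
```
]]></preformat>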
      <p>To avoid the effort-demanding task of translating messages from German to English, we developed a
classifier for textual feedback expressed in German. The classifier has been trained on a dataset containing 600
German messages, which were annotated with respect to sentiment and feedback category by the software
developers. The accuracy of the resulting classifier amounts to 59% [KPS18].</p>
    </sec>
    <sec id="sec-2">
      <title>Ongoing Research Exploiting NLP for RE</title>
      <p>Based on our experience in the SUPERSEDE project, we are currently implementing and validating a tool that
will support user-feedback driven requirements prioritization. In a nutshell, the idea is to identify which software
requirements are affected by user-feedback, by exploiting text similarity techniques enriched with domain
ontologies. An estimate of the value that users perceive in the software requirements is computed from properties of
the user-feedback, which we can derive with the analysis techniques we developed in SUPERSEDE [MRKP18].
The properties we consider include topics, sentiment, the intention of the user as determined by the
speech-acts, and the severity of the issue reported in the feedback.</p>
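      <p>As a rough illustration of matching feedback to the requirements it affects, the sketch below computes cosine similarity over bags of concepts. It is not the tool under development: the CONCEPTS dictionary is a hypothetical stand-in for a domain ontology, the requirements and the feedback message are invented examples, and the threshold value is an arbitrary assumption.</p>
      <preformat><![CDATA[
```python
import math
import re
from collections import Counter
from typing import Dict, List, Tuple

# Hypothetical stand-in for a domain ontology: surface term -> shared concept.
CONCEPTS = {
    "bill": "cost", "price": "cost", "costs": "cost",
    "usage": "consumption", "consumed": "consumption",
    "graph": "chart", "plot": "chart",
}


def concept_vector(text: str) -> Counter:
    """Bag of concepts: lowercase word tokens, mapped to a concept when known."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(CONCEPTS.get(t, t) for t in tokens)


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def affected_requirements(
    feedback: str, requirements: Dict[str, str], threshold: float = 0.25
) -> List[Tuple[str, float]]:
    """Requirements whose similarity to the feedback reaches the threshold, best first."""
    fb = concept_vector(feedback)
    scored = [(rid, cosine(fb, concept_vector(text))) for rid, text in requirements.items()]
    return sorted([(rid, s) for rid, s in scored if s >= threshold], key=lambda x: -x[1])


# Invented requirements and feedback for an energy management application.
requirements = {
    "R1": "Show energy consumption as a chart",
    "R2": "Export the monthly cost report",
}
feedback = "The usage graph is hard to read"
print(affected_requirements(feedback, requirements))  # only R1 passes the threshold
```
]]></preformat>
      <p>Without the concept mapping, "usage graph" and "consumption ... chart" share no tokens, which is the gap that ontology enrichment is meant to close. The sketch omits stop-word removal, so common words such as "the" still contribute a small amount of similarity.</p>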
      <p>The longer-term research objective is to integrate this tool into multi-criteria requirements prioritization
tools, such as the one we developed in SUPERSEDE [KMP+17]. A research preview is presented in [MMK+17].</p>
      <p><sup>5</sup> https://www.cs.waikato.ac.nz/ml/weka/</p>
      <p>A related line of research is ongoing in the App development domain, where key artifacts are users' App
reviews [DLPS19]. In this case, too, the idea is to focus on RE decision-making tasks that can be supported by
automated analyses of textual feedback. We will combine NLP techniques for processing App reviews with
unsupervised learning techniques, and design and execute empirical studies with the intended users of the proposed tool, i.e.
software developers or requirements engineers, to evaluate its effectiveness and usefulness. As a longer-term
objective, we are considering revisiting our research on conceptual modeling for RE from a data-driven engineering
perspective. Specifically, we intend to investigate how to exploit the analysis of textual artifacts expressed in natural
language (e.g. documents and stakeholders' feedback) to elicit model elements as well as to validate models.
      </p>
      <p>[BH79] Kent Bach and Robert M. Harnish. Linguistic Communication and Speech Acts. MIT Press,
Cambridge, MA, 1979.</p>
      <p>[CRST11] Alessandro Cimatti, Marco Roveri, Angelo Susi, and Stefano Tonetta. Formalizing requirements
with object models and temporal constraints. Software and System Modeling, 10(2):147-160, 2011.</p>
      <p>[CRST12] Alessandro Cimatti, Marco Roveri, Angelo Susi, and Stefano Tonetta. Validation of requirements
for hybrid systems: A formal approach. ACM Trans. Softw. Eng. Methodol., 21(4):22:1-22:34, 2012.</p>
      <p>[DLPS19] Jacek Dabrowski, Emmanuel Letier, Anna Perini, and Angelo Susi. Finding and analyzing app
reviews related to specific features: A research preview. To appear in 24th International
Conference on Requirements Engineering: Foundation for Software Quality (REFSQ 2019), 2019.</p>
      <p>[Mor15] Itzel Morales-Ramirez. Exploiting Online User Feedback in Requirements Engineering. PhD thesis,
University of Trento, Italy, 2015.</p>
      <p>[MPC14] Itzel Morales-Ramirez, Anna Perini, and Mariano Ceccato. Towards supporting the analysis of online
discussions in OSS communities: A speech-act based approach. In Information Systems Engineering
in Complex Environments - CAiSE Forum 2014, Thessaloniki, Greece, June 16-20, 2014, Selected
Extended Papers, pages 215-232, 2014.</p>
    </sec>
  </body>
  <back>
    <ref-list />
  </back>
</article>