Research on NLP for RE at CNR-ISTI: a Report

                                    Stefania Gnesi                                 Alessio Ferrari
                                      ISTI-CNR                                       ISTI-CNR
                                       Pisa, Italy                                    Pisa, Italy
                               stefania.gnesi@isti.cnr.it                     alessio.ferrari@isti.cnr.it


                                                                 Abstract

                          [Team Overview] The Formal Methods & Tools (FMT) group of CNR-ISTI
                          focuses on the study and development of formal methods and tools to support
                          software development processes. [Past Research] FMT started working on
                          requirements formalisation through natural language processing (NLP) at the
                          end of the nineties. This stream of research evolved into requirements
                          analysis and defect detection by means of NLP, with a focus on ambiguity, and
                          resulted in the development and application of the QuARS tool for
                          requirements analysis. More recently, the group started working on the analysis of
                          requirements elicitation interviews, in which ambiguity in spoken natural
                          language and other communication defects are studied. [Research Plan] In the
                          upcoming years, FMT will devote its effort to the diffusion of a dataset for
                          requirements analysis, to the usage of NLP in product line engineering, and
                          to research in NLP techniques applied to the analysis of interviews.


1    Team Overview
The Formal Methods & Tools (FMT) group (http://fmt.isti.cnr.it/) of the Institute of Information Science and Technologies,
National Research Council (CNR-ISTI) is active in the development and application of formal notations,
methods and software support tools for the specification, design and verification of complex computer systems. These
efforts are based mostly on the foundational concepts and techniques of process algebras, temporal logics, and model checking.
Complementary to these more foundational research activities, natural language processing (NLP) techniques
have been applied to the analysis of natural language (NL) requirements. The original activity on the application of NLP
techniques to NL requirements started in the nineties with an attempt to translate NL sentences describing safety and
liveness properties into temporal logic formulae. The aim was to help engineers express such properties, which is not always
easy due to the ambiguity that can be hidden in their NL description.
Building on the experience gained in the formalization of NL requirements, we moved to the application of NLP techniques
to the analysis of requirements documents, standards descriptions and public administration documents, with the
purpose of detecting linguistic defects that affect the quality of these documents. We complemented this activity with the
development of tools for the automatic analysis of NL defects. More recently, the experience acquired in NLP
was used in product line engineering to devise techniques to identify commonalities and variabilities in NL requirements
documents, and we also started researching linguistic problems in requirements elicitation interviews.
The remainder of the paper is structured as follows. In Sect. 2 we present the research done in our laboratory in the past
years, and in Sect. 3 we outline the directions of research that we plan to explore.

Copyright © 2018 by the paper’s authors. Copying permitted for private and academic purposes.
2     Past Research on NLP for RE
2.1   Formalization of NL Requirements

The first application of NLP techniques to requirements engineering at FMT was the development of a
prototype assistant, NL2ACTL [FAN94], which automatically translated NL sentences expressing properties of
reactive systems into formulae of the action-based temporal logic ACTL. NL2ACTL was built using a general development
environment for NLP, PDGE, and it was interfaced with a verification environment for checking behavioural
and logical properties of reactive systems. NL2ACTL was supported by a grammar embodying the knowledge of particular
conventions, which allowed the informative gaps of jargon expressions to be completed while avoiding ambiguities.
Moreover, it was built with the purpose of recognizing as many sentences as possible, and of defining a semantics for each
sentence that removed ambiguities in the description of the properties. NL2ACTL was implemented in Common Lisp running on a
Macintosh SE 30 and it is no longer available.
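
To give a flavour of template-based translation from NL sentences to temporal logic, the following Python sketch maps a few fixed sentence patterns to CTL-style formula strings. It is only an illustration under stated assumptions: the sentence templates, the function name translate and the output notation are hypothetical, and they do not reproduce the grammar or the ACTL syntax used by NL2ACTL.

```python
import re

# Hypothetical sentence templates mapped to CTL-style formula templates.
# Illustrative only: this is not the grammar used by NL2ACTL.
TEMPLATES = [
    (re.compile(r"^it is never the case that (?P<p>.+)$"), "AG not ({p})"),
    (re.compile(r"^it is always the case that (?P<p>.+)$"), "AG ({p})"),
    (re.compile(r"^eventually (?P<p>.+)$"), "AF ({p})"),
    (re.compile(r"^it is possible that (?P<p>.+)$"), "EF ({p})"),
]


def translate(sentence):
    """Return a formula string, or None if no template matches."""
    text = sentence.strip().rstrip(".").lower()
    for pattern, formula in TEMPLATES:
        match = pattern.match(text)
        if match:
            return formula.format(p=match.group("p"))
    return None


if __name__ == "__main__":
    print(translate("It is never the case that two trains occupy the same section."))
    print(translate("Eventually the door closes."))
```

A sentence that matches no template yields None, which mirrors the idea that a restricted grammar accepts only sentences whose interpretation is unambiguous.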


2.2   Quality Analysis of NL Requirements: Techniques and Tools

NL requirements are widely used in the software industry, at least as the first level of description of a system. Unfortunately,
they are often prone to errors, and this is partially caused by interpretation problems due to the use of NL itself.
Evaluating NL requirements to address the interpretation problems that stem from linguistic issues was therefore considered an
interesting research problem. However, like any other evaluation process, the quality evaluation of NL software requirements
needs the definition of a quality model.
    Working in collaboration with Vincenzo Gervasi and Salvatore Ruggieri, we distinguished four quality types, namely
syntactic, structural, semantic, and pragmatic [FAB98, FAB00, FAB01a, FAB01b]. The quality model was subsequently
refined [BER06] to include the ambiguities described by Berry, Kamsties, and Krieger [BER03] that were not already
covered by the initial quality model.
    The quality model was the basis for implementing a tool, called QuARS¹ (Quality Analyzer for Requirement Specifications),
for analyzing NL requirements in a systematic and automatic way [GNE05]. QuARS allows requirements
engineers to perform an initial parsing of the requirements to automatically detect potential linguistic defects that can
cause interpretation problems at the subsequent development stages of a software product. The tool is also able to
partially support consistency and completeness analysis by clustering the requirements according to specific topics
[FAB04]. The rather simple clustering approach of QuARS was later improved by applying a clustering algorithm
that exploits lexical and syntactic relationships between natural language requirements to group together similar
requirements contained in a requirements document [FER12]. The QuARS approach, oriented to requirements, was later
extended to address defects in public administration documents, leading to the development of the online tool QuOD (Quality
checker for Official Documents)².
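
As a minimal illustration of the kind of lexical check that a dictionary-based analyzer can perform, the following Python sketch scans a requirement for indicator terms and reports the potential defects found. The defect class names, the word lists and the function find_defects are assumptions made for this example; they do not reproduce the dictionaries or the implementation of QuARS.

```python
import re

# Hypothetical indicator lists, chosen for illustration only.
# QuARS relies on its own dictionaries, which are not reproduced here.
INDICATORS = {
    "vagueness": ["adequate", "easy", "flexible", "significant", "user-friendly"],
    "optionality": ["possibly", "optionally", "if needed", "as appropriate"],
    "weakness": ["may", "could", "might"],
}


def find_defects(requirement):
    """Return (defect_class, term) pairs for each indicator found in the text."""
    text = requirement.lower()
    found = []
    for defect_class, terms in INDICATORS.items():
        for term in terms:
            # Word boundaries avoid matching "may" inside "mayor".
            if re.search(r"\b" + re.escape(term) + r"\b", text):
                found.append((defect_class, term))
    return found


if __name__ == "__main__":
    for req in [
        "The system may provide a user-friendly interface.",
        "The system shall log every access.",
    ]:
        print(req, "->", find_defects(req))
```

Each hit is only a potential defect: the final judgement on whether the flagged term causes an interpretation problem remains with the requirements engineer.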
    The approach provided by QuARS is mainly focused on lexical and syntactic quality aspects, while the pragmatic
aspect, which depends on the reader of the requirements, is not taken into account. To address the challenge of pragmatic
ambiguity, we proposed a novel algorithm that, based on the automatic construction of knowledge graphs that represent
the domain-specific information of different subjects, is able to identify potential reader-dependent discrepancies in the
interpretation of the same requirements [FER12], [FER14b]. More recently, using Wikipedia crawling and word
embeddings, we devised an approach for measuring the degree of ambiguity of computer science terms (e.g., system,
interface, code) when they are used in different domains. The approach, tested on five different domains, lays
the basis for domain-specific ambiguity detection.
    The quality of NL requirements is not limited to linguistic defects, but is also affected by the structure of the specification
and by its completeness. In [FER13b], we proposed a novel clustering algorithm named Sliding Head-Tail Component
(S-HTC) to improve the structure of requirements specifications, and we applied it to a requirements standard of the railway
domain (583 requirements), showing the potential of the approach. In [FER14c], we defined a method to measure and
improve the completeness of requirements documents with respect to the input documents of the requirements definition
phase, such as preliminary specifications, transcripts of meetings with the customers, etc. The method is based on the
automatic identification of concepts and relationships in the input documents, and on the assessment of whether such elements are
transferred to the requirements document.
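
The idea of measuring completeness with respect to the input documents can be sketched as a coverage ratio. The Python example below is a simplification under stated assumptions: concepts are approximated by non-stop-word tokens, relationships are ignored, and the names concepts, completeness and STOP_WORDS are hypothetical. The method of [FER14c] relies on proper concept and relationship extraction, which is not reproduced here.

```python
import re

# Small hypothetical stop-word list; a real pipeline would use proper term extraction.
STOP_WORDS = {"the", "a", "an", "of", "to", "and", "shall", "must",
              "be", "is", "in", "with", "for", "on"}


def concepts(text):
    """Approximate 'concepts' as the set of lowercased non-stop-word tokens."""
    return {t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS}


def completeness(input_docs, requirements_doc):
    """Return (ratio, missing): the share of input concepts found in the
    requirements document, and the input concepts that are missing from it."""
    source = set()
    for doc in input_docs:
        source |= concepts(doc)
    if not source:
        return 1.0, set()
    covered = source & concepts(requirements_doc)
    return len(covered) / len(source), source - covered


if __name__ == "__main__":
    inputs = [
        "The operator must confirm the alarm.",
        "The alarm is logged with a timestamp.",
    ]
    ratio, missing = completeness(inputs, "The operator shall confirm the alarm.")
    print(ratio, sorted(missing))
```

The set of missing concepts is the actionable part of the result, since it points the analyst to input material that has not yet been transferred to the requirements.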

    1 http://quars.isti.cnr.it/
    2 http://narwhal.it/quod/index.html
2.3   Quality Analysis of NL requirements: Application Experiences
Use cases are powerful tools to capture functional requirements for software systems. They allow structuring requirements
according to user goals, and provide a means to specify the interaction between a certain software system and its environment.
As part of the ITEA project CAFFÉ [FAN02], we started a collaboration with Nokia on the use of linguistic methods
to collect metrics and perform a qualitative analysis of the natural-language-based
use case modelling technique used by the company to specify functional requirements for the mobile phone software user
interface [FAN02] [FAN03].
    Empirical experiments that assess the impact of automation on the requirements review process of a software company,
in terms of effectiveness and efficacy, are needed to evaluate the usability and applicability of NLP-based tools for the quality
analysis of textual requirements [LAM07]. Three different experiments may be cited in this direction. One concerns the
use of QuARS, in collaboration with Siemens CNX R&D Labs, on telecommunication requirement documents [BUC05].
The other two, [BUC08] and [ROS17], report experiences in applying NLP techniques to automatically
identify quality defects in natural language requirements in the railway domain. In [BUC08], a customization
of QuARS, QuARS Express, was used to evaluate the quality of a large set of requirements developed in an EU project. In
[ROS17] we report the experience gained within a collaboration between a world-leading railway signalling company, the
University of Florence, and ISTI-CNR to investigate the feasibility of using NLP for defect identification in the requirements
documents of the company. The experience shows that existing rule-based NLP approaches need to be incrementally
tailored to the specific language of a company to achieve a sufficient degree of accuracy.

2.4   Analysing NL Requirements to Identify Commonalities and Variabilities in Product Lines
NL documents can be a precious source of variability information, which can later be used to define
feature models from which different systems can be instantiated. Moreover, a company that wishes to enter an established market
with a new, competitive product needs to analyse the product solutions of its competitors from available documentation,
such as brochures. Identifying and comparing the features provided by the other vendors might greatly help
during the market analysis. In this context, we have devised and applied a strategy, based on advanced information extraction
NLP technologies, for mining common and variant features from available documents belonging to different vendors
[FER13a, FER14a, FER15].
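
Once candidate features have been extracted from each vendor's documents, the split into commonalities and variabilities can be sketched with plain set operations. The Python example below assumes that feature terms are already available as lists per vendor; the extraction step, the vendor names, the feature terms and the function name are hypothetical and do not reproduce the approach of [FER13a, FER14a, FER15].

```python
def commonalities_and_variabilities(features_by_vendor):
    """Split features into those shared by every vendor (commonalities)
    and those offered only by some vendors (variabilities)."""
    feature_sets = [set(features) for features in features_by_vendor.values()]
    if not feature_sets:
        return set(), set()
    common = set.intersection(*feature_sets)
    variable = set.union(*feature_sets) - common
    return common, variable


if __name__ == "__main__":
    # Hypothetical feature terms, assumed to be already extracted from brochures.
    brochures = {
        "VendorA": ["train protection", "interlocking", "driverless operation"],
        "VendorB": ["train protection", "interlocking", "energy saving"],
        "VendorC": ["train protection", "interlocking"],
    }
    common, variable = commonalities_and_variabilities(brochures)
    print(sorted(common))
    print(sorted(variable))
```

In a feature model, the common features would typically become mandatory features and the variable ones optional or alternative features, a decision that still requires domain expertise.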
    Still in the field of product line engineering, we used NLP techniques to identify variation points in single requirements
documents, based on the following rationale. NL is intrinsically ambiguous, and this is seen often as a possible source of
problems in the later interpretation of requirements. However, ambiguity or underspecification at requirements level can in
some cases give an indication of possible variability, either in design choices, in implementation choices or configurability.
Taking into account the results of our previous analyses conducted on different requirements documents with NLP analysis
tools, a first classification of the forms of ambiguity that indicate variation points has been proposed, starting from the
analysis of documents describing real systems [FAN17]. One of the takeaway messages of [FER13a, FER14a, FER15,
FAN17] was that the ambiguity defects found in NL requirement documents may be a means to extract variability issues.
The underlying intuition is that often ambiguity in requirements is due to the (conscious or subconscious) need to postpone
choices for later decisions in the implementation of the system.

2.5   Requirements Elicitation Interviews
Ambiguity and communication problems in requirements engineering do not affect solely the requirements documentation
activity, but also requirements elicitation. Interviews are the most common and effective means to perform requirements
elicitation, and in [FER15] we started studying the phenomenon of ambiguity in requirements elicitation interviews. Based
on arranged interviews, we categorised the different types of ambiguities that were observed, and we proposed a novel
classification of ambiguity, to take into account the pragmatic, subject-dependent facet of the phenomenon. The categorisation
was later extended in [FER16b]. Stemming from this work, we identified the dominant cues of
ambiguity [FER16a], showing that under-specified terms (e.g., system, area) were dominant with respect to, e.g., vague
terms and pronouns, which are more common sources of ambiguity in written documents. We also noticed that half of the cases
encountered in practice were not triggered by single terms: entire sentences, and their communication context, had to
be taken into account to explain them. Therefore, in [FER17e], we proposed to use argumentation theory to explain these
cases in a formal way.
   One of the takeaway messages of [FER15] was that ambiguity in interviews could be seen as a resource to disclose tacit
knowledge. Indeed, the occurrence of an ambiguity might reveal the presence of unexpressed, system-relevant knowledge
that needs to be elicited. To leverage the usefulness of ambiguity, in [FER17b] we proposed a method to review interview
recordings, spotting ambiguities that were not discovered during the interview. We also showed that different subjects
identify highly different ambiguities in interviews. Within this stream of research, we also went beyond ambiguity and
looked at other defects in interviews, focusing on the mistakes that are committed by student analysts [FER17a].
   Although the work on interviews is not strictly NLP-related, it lays the theoretical bases for future studies in speech
processing (see Sect. 3), which may be made possible by the recent, and future, advances in this field.

3    Research Plan on NLP for RE
Given the recent advances in NLP technologies, FMT is at the forefront of applying and tailoring available approaches to
the requirements engineering context. Our current and planned research activity mainly concerns (a) the delivery of a
complete dataset of requirements documents to ease NLP experiments, (b) the assessment of techniques for commonality
and variability identification, and (c) the study of NLP and speech processing techniques in the context of interviews.
a) As highlighted in our recent vision paper [FER17c], one of the major challenges posed by novel NLP technologies,
   mostly based on machine learning, is the need for large datasets, which are required for training and testing machine
   learning algorithms. Therefore, we have defined a requirements dataset named PURE (PUblic REquirements
   dataset) [FER17f], and made it available to the community³. We plan to incrementally extend the dataset, also based on
   the contribution of other researchers, and we are formatting the different documents available to reach a common XML
   format.
b) Another line of research that we are pursuing is related to the application of ambiguity detection techniques to the
   discovery of variation points in requirements documents. Specifically, we have recently applied the approach described
   in [FAN17] to three documents belonging to different domains [FAN18], and we are currently performing a more
   extensive, and empirically grounded, experimentation to validate the approach.
c) Requirements elicitation interviews are another NLP-related topic of research to which we are devoting our efforts.
   Specifically, we performed an empirical evaluation of the interview review approach presented in [FER17b] in collaboration
   with Kennesaw State University and University of Sydney, and the results have been recently accepted for publication
   [SPO18]. A long-term research direction in interview analysis is the application of NLP and speech processing
   to interview transcripts and recordings. Specifically, we wish to automatically identify conversation topics in interview
   transcripts, by leveraging technologies for information extraction that we already applied to requirements [FER14c].
   This will be useful to retrieve requirements sources later in the development process, when requirements are formalised
   into documents. It could also be used in those contexts in which requirements are never documented, and pass directly
   from their spoken form to software. Of course, this basic idea requires interviews to be transcribed, which is a
   time-consuming process. In August 2017, Microsoft Research published a conversational speech recognition system that
   achieves an error rate comparable to that of human transcribers [XIO17]. We argue that these advances can be leveraged
   by our future research in NLP for RE, and will enable a more in-depth understanding, and consequent control, of
   the process of transforming ideas into spoken sentences, written requirements, and, finally, software.

    3 http://fmt.isti.cnr.it/nlreqdataset/

References
[FAN94] A. Fantechi, S. Gnesi, G. Ristori, M. Carenini, M. Vanocchi, P. Moreschini: Assisting Requirement Formalization by Means
    of Natural Language Translation. Formal Methods in System Design 4(3): 243-263, Springer, 1994.
[FAB98] F. Fabbrini, M. Fusani, V. Gervasi, S. Gnesi, S. Ruggieri: On Linguistic Quality of Natural Language Requirements. 4th
    REFSQ, 57-62, Presses Universitaires de Namur, 1998.
[FAB00] F. Fabbrini, M. Fusani, S. Gnesi, G. Lami: Software requirements verification by natural language analysis: a CNR initiative
    for Italian SME’s. ERCIM News 40: 52-53, 2000.
[FAB01a] F. Fabbrini, M. Fusani, S. Gnesi, G. Lami: The linguistic approach to the natural language requirements quality: benefit of the
    use of an automatic tool. 26th Annual NASA Software Engineering Workshop, 97-105, IEEE, 2001.
[FAB01b] F. Fabbrini, M. Fusani, S. Gnesi, G. Lami: An automatic quality evaluation for natural language requirements. 7th REFSQ,
    2001.
[GNE05] S. Gnesi, G. Lami, G. Trentanni: An automatic tool for the analysis of natural language requirements. Computer Systems
    Science & Engineering 20(1), CRL Publishing, 2005.
[FAB04] F. Fabbrini, M. Fusani, S. Gnesi, G. Lami: Automatic clustering of non-functional requirements. IASTED Conf. on Software
    Engineering and Applications 2004, 672-677, IASTED/ACTA, 2004.
[BER06] D. M. Berry, A. Bucchiarone, S. Gnesi, G. Lami, G. Trentanni: A new quality model for natural language requirements
    specifications. 12th REFSQ, 2006.
[FER12] A. Ferrari, S. Gnesi, G. Tolomei: A clustering-based approach for discovering flaws in requirements specifications. SAC,
    1043-1050, ACM, 2012.
[FAN02] A. Fantechi, S. Gnesi, G. Lami, A. Maccari: Application of Linguistic Techniques for Use Case Analysis. 10th RE: 157-164,
    IEEE, 2002.
[FAN03] A. Fantechi, S. Gnesi, G. Lami, A. Maccari: Applications of linguistic techniques for use case analysis. Requir. Eng. 8(3):
    161-170, Springer, 2003.
[LAM07] G. Lami, R. W. Ferguson: An empirical study on the impact of automation on the requirements analysis process. Journal
    of Computer Science and Technology 22(3): 338-347, Springer, 2007.
[BUC05] A. Bucchiarone, S. Gnesi, P. Pierini: Quality Analysis of NL Requirements: An Industrial Case Study. 13th RE: 390-394,
    IEEE, 2005.
[BUC08] A. Bucchiarone, S. Gnesi, G. Trentanni, A. Fantechi: Evaluation of Natural Language Requirements in the MODCONTROL
    Project, ERCIM News 2008(75), 2008.
[ROS17] B. Rosadini, A. Ferrari, G. Gori, A. Fantechi, S. Gnesi, I. Trotta, S. Bacherini: Using NLP to Detect Requirements Defects:
    An Industrial Experience in the Railway Domain. 23rd REFSQ, LNCS 10153, 344-360, Springer, 2017.
[FER13a] A. Ferrari, G. O. Spagnolo, F. Dell’Orletta: Mining commonalities and variabilities from natural language documents. 17th
    SPLC:116-120, ACM, 2013.
[FER14a] A. Ferrari, G. O. Spagnolo, G. Martelli, S. Menabeni: From commercial documents to system requirements: an approach for
    the engineering of novel CBTC solutions. STTT 16(6): 647-667, Springer, 2014.
[FER15] A. Ferrari, G. O. Spagnolo, S. Gnesi, F. Dell’Orletta: CMT and FDE: tools to bridge the gap between natural language
    documents and feature diagrams. 19th SPLC: 402-410, ACM, 2015.
[FAN17] A. Fantechi, S. Gnesi, L. Semini: Ambiguity defects as variation points in requirements. 11th VaMoS: 13-19, ACM, 2017.
[FAN18] A. Fantechi, A. Ferrari, S. Gnesi, L. Semini: Hacking an Ambiguity Detection Tool to Extract Variation Points: an Experience
    Report, 12th VaMoS: 43-50, ACM, 2018.
[FER15] A. Ferrari, P. Spoletini, S. Gnesi: Ambiguity as a resource to disclose tacit knowledge. 23rd RE: 26-35, IEEE, 2015.
[FER16a] A. Ferrari, P. Spoletini, S. Gnesi: Ambiguity Cues in Requirements Elicitation Interviews. 24th RE: 56-65, IEEE, 2016.
[FER17a] B. Donati, A. Ferrari, P. Spoletini, S. Gnesi: Common Mistakes of Student Analysts in Requirements Elicitation Interviews.
    23rd REFSQ, LNCS 10153, 148-164, Springer, 2017.
[FER16b] A. Ferrari, P. Spoletini, S. Gnesi: Ambiguity and tacit knowledge in requirements elicitation interviews. Requir. Eng. 21(3),
    333-355, Springer, 2016.
[FER17b] A. Ferrari, P. Spoletini, B. Donati, D. Zowghi, S. Gnesi: Interview Review: Detecting Latent Ambiguities to Improve the
    Requirements Elicitation Process. 25th RE, 400-405, IEEE, 2017.
[FER12] A. Ferrari, S. Gnesi: Using collective intelligence to detect pragmatic ambiguities. 20th RE, 191-200, IEEE, 2012.
[FER14b] A. Ferrari, G. Lipari, S. Gnesi, G. O. Spagnolo: Pragmatic ambiguity detection in natural language, AIRE 2014, 1-8, IEEE,
    2014.
[FER13b] A. Ferrari, S. Gnesi, G. Tolomei: Using Clustering to Improve the Structure of Natural Language Requirements Documents.
    19th REFSQ, LNCS 7830, 34-49, Springer, 2013.
[FER14c] A. Ferrari, F. Dell’Orletta, G. O. Spagnolo, S. Gnesi: Measuring and Improving the Completeness of Natural Language
    Requirements. 20th REFSQ: 23-38, LNCS 8396, Springer, 2014.
[FER17c] A. Ferrari, F. Dell’Orletta, A. Esuli, V. Gervasi, S. Gnesi: Natural Language Requirements Processing: A 4D Vision. IEEE
    Software 34(6): 28-35, IEEE, 2017.
[FER17d] P. Spoletini, A. Ferrari: Requirements Elicitation: A Look at the Future Through the Lenses of the Past. 25th RE 2017,
    476-477, IEEE, 2017.
[FER17e] Y. Elrakaiby, A. Ferrari, P. Spoletini, S. Gnesi, B. Nuseibeh: Using Argumentation to Explain Ambiguity in Requirements
    Elicitation Interviews. 25th RE: 51-60, IEEE, 2017.
[BER03] D. M. Berry, E. Kamsties, M. M. Krieger: From Contract Drafting to Software Specification: Linguistic Sources of Ambiguity.
    University of Waterloo, 2017. https://cs.uwaterloo.ca/~dberry/handbook/ambiguityHandbook.pdf
[FER17f] A. Ferrari, G. O. Spagnolo, S. Gnesi: PURE: A Dataset of Public Requirements Documents. 25th RE: 502-505, IEEE, 2017.
[SPO18] P. Spoletini, A. Ferrari, M. Bano, D. Zowghi, S. Gnesi: Interview Review: an Empirical Study on Detecting Ambiguities in
    Requirements Elicitation Interviews. 24th REFSQ, to appear.
[XIO17] W. Xiong, L. Wu, F. Alleva, J. Droppo, X. Huang, A. Stolcke: The Microsoft 2017 Conversational Speech Recognition System.
    Microsoft Technical Report MSR-TR-2017-39, 2017. https://arxiv.org/abs/1708.06073.