Using NLP Tools to Detect Ambiguities in System
Requirements - A Comparison Study
Aleksandar Bajceta1 , Miguel Leon1 , Wasif Afzal1 , Pernilla Lindberg2 and
Markus Bohlin1
1 Mälardalens universitet, Box 325, 631 05 Eskilstuna
2 Alstom Sweden, 721 14 Västerås


Abstract
Requirements engineering is a time-consuming process, and it can benefit significantly from automated tool support. Ambiguity detection in natural language requirements is a challenging problem in the requirements engineering community. Several Natural Language Processing (NLP) tools and techniques have been developed to address the problem of ambiguity detection in natural language requirements. However, there is a lack of empirical evaluation of these tools. We aim to contribute to the understanding of the empirical performance of such solutions by evaluating four tools on a dataset of 180 system requirements for an electric train propulsion system, provided to us by our industrial partner Alstom. The tools selected for this study are Automated Requirements Measurement (ARM), Quality Analyzer for Requirement Specifications (QuARS), REquirements Template Analyzer (RETA), and Requirements Complexity Measurement (RCM). Our analysis showed that the selected tools can achieve high recall; two of them achieved recalls of 0.85 and 0.98. However, they struggled to achieve high precision. RCM, a tool developed in-house by our industrial partner Alstom, achieved the highest precision in our study, 0.68.

Keywords
Requirements engineering, Natural language requirements, Ambiguity, Natural language processing




1. Introduction
Good requirements engineering is the backbone of every successfully developed system. It is the
process in which engineers capture the stakeholders’ needs and transform them into clearly
described system behavior. Many projects have failed due to a poor requirements engineering
process. According to [1], bad requirements engineering is the cause of 71% of software project
failures.
   This paper considers a comparison study for ambiguity analysis applied to the requirements
in a safety-critical application in the railway domain. The development of safety-critical systems,
such as those used in the railway industry, must follow international standards, which demand
that requirements documents be complete, clear, precise, unequivocal, verifiable, testable,
maintainable, and feasible [2].

NLP4RE 2022: 5th Workshop on Natural Language Processing for Requirements Engineering @ REFSQ, Birmingham, UK
(CEUR workshop): https://nlp4re.github.io/2022/
aleksandar.bajceta@mdh.se (A. Bajceta); miguel.leonortiz@mdh.se (M. Leon); wasif.afzal@mdh.se (W. Afzal);
pernilla.lindberg@alstomgroup.com (P. Lindberg); markus.bohlin@mdh.se (M. Bohlin)
© 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), ISSN 1613-0073, http://ceur-ws.org
   Most requirements documents are written in some form of natural language [3]. Natural
language requirements are easy to write and understand, but they can be ambiguous and
error-prone. An ambiguous requirement can be understood differently by developers, testers,
and other stakeholders involved in project development. The occurrence of underspecified or
abstract requirements that can be interpreted differently is one of the most common requirements
engineering problems [4]. This kind of miscommunication can lead to project delays or failure.
Because of this, the identification of ambiguities in natural language requirements plays a vital
role in requirements engineering.
   A system requirement is ambiguous if it admits more than one interpretation. Analyzing
requirements for defects and ambiguities is a time-consuming process, and it can benefit from
an automated solution. Using Natural Language Processing (NLP) for requirements analysis has
been a popular research topic. Researchers have created several tools for identifying ambiguities
or defects in requirements using pattern- or rule-based approaches [2, 5–10]. However, there is
still a lack of industrial evaluation of these tools in the research literature [11]. Our paper aims
to fill this research gap by evaluating several requirements analysis tools for ambiguity
detection in natural language requirements in the railway domain.
   The tools we selected are Automated Requirements Measurement (ARM) [8], Quality Ana-
lyzer for Requirement Specifications (QuARS) [9], REquirements Template Analyzer (RETA) [5],
and Requirements Complexity Measurement (RCM) [10]. We selected the tools based on their
suitability for the type of systems considered in this paper and on their availability. The first three
tools are publicly available, and the fourth was developed in-house at Alstom and is also publicly
available 1 . Furthermore, they all use a similar approach to identifying ambiguities, making
them good candidates for a comparison study. A sample set of 180 requirements was provided
to us by our industrial partner, Alstom Transport AB in Sweden. These requirements were used
in one of Alstom’s projects to define the specification of a train propulsion system. After running
the selected tools on this set of requirements, we compared the tools’ ambiguity identification
results with the ambiguities found by an expert responsible for requirements engineering at Alstom.
   The remainder of the paper is structured as follows. Related work is presented in Section
2. In Section 3, we describe the tools used in the analysis. The methodology is described in
Section 4. The results of the study are presented in Section 5. We address validity threats in
Section 6. Section 7 concludes the paper.


2. Related Work
We identified several papers that compare requirements analysis tools, whose focus is on
reducing ambiguities and improving overall requirements quality.
  In [12], researchers compared their unnamed tool, which was still in development, with two
additional requirements engineering tools: a reconstructed ARM tool [8] and TIGER Pro 2.0 2 .
The study was performed on a set of requirements from two projects. The researchers concluded
that their new tool was not yet good enough and that it identified many false-positive
   1 https://github.com/eduardenoiu/NALABS/tree/master/RCM
   2 http://www.therightrequirement.com/TigerPro/TigerPro.html
examples. A similar comparison study was performed in [13], where the authors presented
an experience report on using three different tools for requirements analysis. They compared
two commercial tools, Requirement Scout and QVscribe, and one academic tool, QuARS. The
study was performed by running the tools on a set of requirements for a simple E-shop
software. The conclusion was that all three tools perform quite similarly and that
the two newer tools, Requirement Scout and QVscribe, did not outperform QuARS, which was
developed almost 20 years ago.
   Different techniques, tools, and their underlying technologies are reviewed and discussed in [14–17].
In these papers, the researchers reviewed tools and technologies without experimenting
on an actual set of requirements. In [14], the researchers performed a literature review
focusing on tools for the automatic detection of ambiguities in requirements. They extracted
25 tools developed between 2008 and 2018. The ten most popular tools were compared with
respect to the ambiguity types they address, the technologies they use, and their approaches to
detecting ambiguities. In [15], the researchers reviewed different techniques used for ambiguity
detection in NL requirements. They identified three approaches to detecting ambiguities: manual,
semi-automatic using natural language processing techniques, and semi-automatic using machine
learning techniques. They classified the detection of ambiguities with patterns as a semi-automatic
approach using NLP. Similar studies were done in [16, 17], where literature reviews were performed
on detecting and resolving ambiguities in NL requirements. The researchers compared tools and
approaches by different parameters, such as the technologies used, the targeted ambiguity, user
interaction, etc.
   In our work, we introduce two tools that have not previously been compared in the literature. Also,
the tools are evaluated using requirements that were assessed and used in an industrial project.


3. Selected Tools
The availability of tools for requirements engineering is poor, and most of them are not publicly
available [11]. Public versions of several tools do not contain all features, such as the set of
patterns needed for the tool’s full functionality. We were also unable to run certain tools in our
environment. For our study, we selected four tools. ARM and QuARS are among the oldest
tools for requirements analysis, but they are still quite popular and built a foundation for further
research and tool development. RETA is a newer tool, and a comparison with the older tools is
relevant to us. The fourth tool selected for this study is RCM, a tool developed by Alstom,
primarily to measure requirements complexity in Alstom’s internal projects, but it can also be
used to detect ambiguities. The focus of these tools is to improve quality, reduce complexity,
and identify ambiguity in requirements. They automatically detect ambiguous requirements by
comparing the requirements against their patterns or rules.

3.1. Automated Requirements Measurement (ARM)
NASA developed the ARM tool in the 1990s. It was discontinued in 2011, and for our study
we are using a reconstructed ARM tool developed by Carlson and Laplante [8]. The ARM tool
has several quality indicators that are used for requirements analysis: imperatives, directives,
continuances, size, specification depth, text structure, options, and weak phrases. Imperatives
indicate absolute necessity using command words such as “shall“ or “must“. Continuances are
words or phrases that follow imperative words, and their extensive use can increase requirement
complexity. Examples of continuances are “following“, “as follows“, “and“, etc. Size includes
counts of three indicators: total lines of text, the total number of imperative words, and the
total number of subjects. Specification depth indicates how concisely the document specifies
the requirements by counting the number of imperative statements at each level of the
requirement document’s text structure. Text structure measures the number of indicators
identified at each hierarchical level of the requirement document. Based on NASA’s requirements
quality model, the indicators of ambiguity are options and weak phrases. The terms for options
are “can“, “may“, and “optionally“. These terms loosen the specification, and they could allow
the developer to decide what should be implemented in more than one way. Weak phrases are a
set of terms that could leave requirements open to multiple or subjective interpretations. There
are 12 pre-defined weak phrases in the ARM tool, some of which are “adequate“, “be able to“,
“timely“, “as appropriate“, etc.
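To make the keyword-matching idea concrete, the following is a minimal sketch in Python (our own illustration, not ARM’s implementation); the phrase lists contain only the example terms quoted above, not ARM’s complete dictionaries.

```python
import re

# Example terms quoted in this section; ARM's full dictionaries are larger
# (e.g. 12 pre-defined weak phrases).
OPTIONS = ["can", "may", "optionally"]
WEAK_PHRASES = ["adequate", "be able to", "timely", "as appropriate"]

def count_indicators(requirement: str, phrases: list[str]) -> int:
    """Count indicator phrases using case-insensitive, word-boundary matching."""
    text = requirement.lower()
    return sum(len(re.findall(r"\b" + re.escape(p) + r"\b", text))
               for p in phrases)

req = "The system may provide adequate cooling in a timely manner."
print(count_indicators(req, OPTIONS))       # 1 ("may")
print(count_indicators(req, WEAK_PHRASES))  # 2 ("adequate", "timely")
```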

3.2. Quality Analyzer for Requirement Specifications (QuARS)
QuARS is a requirements analysis tool, developed by Lami et al. [9], that focuses on ambiguity
detection. QuARS performs lexical and syntactic analysis to find ambiguities in natural language
requirements. To perform lexical analysis, QuARS uses a set of patterns or keywords to identify
optionality, subjectivity, vagueness, and weakness in the requirements. Optionality indicates
that the requirement contains an optional part. Examples of terms used to identify optionality
are “possibly“, “eventually“, “optionally“, etc. Subjectivity represents the occurrence of terms that
indicate personal opinion, for example, “having in mind“ or “take into account“. Vagueness
indicates non-quantifiable terms such as “significant“ or “adequate“. Weakness indicates that
the requirement does not have an imperative, and it is identified with terms such as “can“,
“could“, “may“, etc. The goal of syntactic analysis is to find patterns that could identify implicity,
multiplicity, and under-specification among requirements. Implicity occurs when pronouns or
other indirect references represent a subject or object in a sentence. Multiplicity occurs when
the sentence has more than one main verb, subject, or object. Under-specification represents
the occurrence of words that should be instantiated, for example, “testing“ instead of “functional
testing“ or “unit testing“.
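As an illustration of what such a syntactic check might look like (a rough sketch of ours, not QuARS’s actual implementation), a multiplicity-style detector can count coordinated main verbs in a dependency parse, for example with spaCy:

```python
# Our own sketch of a multiplicity-style check (more than one main verb),
# not QuARS's actual implementation. Requires a spaCy model, e.g.:
#   python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")

def has_multiplicity(requirement: str) -> bool:
    """Flag requirements whose parse has more than one coordinated main verb."""
    doc = nlp(requirement)
    main_verbs = [t for t in doc
                  if t.pos_ == "VERB" and t.dep_ in ("ROOT", "conj")]
    return len(main_verbs) > 1

# Two coordinated main verbs ("log" and "notify"); likely flagged,
# although the exact result depends on the parser.
print(has_multiplicity("The unit shall log the fault and shall notify the driver."))
```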

3.3. REquirements Template Analyzer (RETA)
RETA is a requirements analysis tool for automatically checking the conformance of natural
language requirements to templates [5]. The tool is implemented as a plugin for an open-source
framework called GATE workbench 3 . The RETA tool supports analysis of the requirements
against two templates, Rupp’s [18] and the EARS templates [19]. A template prescribes a require-
ment structure, and following it should reduce ambiguity in the requirement. The RETA tool
also checks for potentially problematic terms and constructions that could cause ambiguity, such
as passive voice, pronouns, quantifiers (terms used for quantification, such as “all“, “any“,
“every“), “and/or“, vague terms, plural terms, etc.

    3 https://gate.ac.uk/
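As a rough illustration of template conformance checking (a simplified sketch of ours; RETA itself uses full NLP annotation in the GATE workbench rather than plain regular expressions), a few EARS-style patterns can be approximated as follows:

```python
import re

# Simplified approximations of some EARS requirement templates;
# the real templates have more variants and constraints.
EARS_PATTERNS = [
    re.compile(r"^The .+ shall .+", re.IGNORECASE),              # ubiquitous
    re.compile(r"^When .+, the .+ shall .+", re.IGNORECASE),     # event-driven
    re.compile(r"^While .+, the .+ shall .+", re.IGNORECASE),    # state-driven
    re.compile(r"^If .+, then the .+ shall .+", re.IGNORECASE),  # unwanted behaviour
]

def conforms_to_template(requirement: str) -> bool:
    """Return True if the requirement matches at least one template pattern."""
    return any(p.match(requirement.strip()) for p in EARS_PATTERNS)

print(conforms_to_template(
    "When the pantograph is raised, the converter shall pre-charge."))  # True
print(conforms_to_template(
    "It would be desirable to have adequate cooling."))                 # False
```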
3.4. Requirements Complexity Measurement (RCM)
RCM is a tool developed at Alstom for analyzing requirements complexity [10]. The tool’s
metrics for measuring complexity are the following: number of words, number of vague phrases,
number of conjunctions, number of reference documents, optionality, subjectivity, weakness,
automated readability index, imperatives, and continuances. The number of words is a metric
used to measure the length of the requirement. The number of vague phrases represents the
total number of terms that could cause problems in understanding the requirement, such as
“adequate“, “appropriate“, “normal“, etc. The number of conjunctions counts the total number of
phrases such as “and“, “after“, “although“, etc. The number of reference documents indicates
that a requirement contains references to other documents and therefore requires additional
reading to be understood. The optionality metric is implemented in the same way as in the ARM
tool, using keywords such as “can“, “may“, and “optionally“. Subjectivity indicates a personal
feeling or opinion in a requirement; to identify subjectivity, the tool uses keywords such as
“similar“, “better“, “worse“, etc. Weakness indicates phrases that could introduce uncertainty
into requirements, identified by keywords such as “timely“, “be able to“, “be capable of“, etc. The
automated readability index depends on the total number of words in the requirement
and the average number of letters per word. Imperatives indicate command words and are
identified using keywords such as “shall“, “must“, “will“, etc. Continuances are implemented in
the same way as in the ARM tool. The measures selected to identify ambiguity among requirements
are optionality, subjectivity, vague phrases, and weak phrases. We chose these indicators since
Alstom’s engineers identified them during tool development as rules that should be followed to
avoid ambiguity in requirements.
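For reference, the standard formulation of the automated readability index can be computed as in the sketch below; we assume RCM uses this standard formula or a close variant, since the text above only names its inputs.

```python
import re

def automated_readability_index(text: str) -> float:
    """Standard ARI: 4.71*(characters/words) + 0.5*(words/sentences) - 21.43.
    We assume RCM uses this formula or a close variant."""
    words = re.findall(r"[A-Za-z0-9]+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()] or [text]
    characters = sum(len(w) for w in words)
    return (4.71 * characters / len(words)
            + 0.5 * len(words) / len(sentences)
            - 21.43)

req = "The propulsion system shall provide adequate braking effort in a timely manner."
print(round(automated_readability_index(req), 2))  # about 10.9 for this example
```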


4. Methodology
To evaluate the selected tools, we used a dataset provided to us by our industrial partner Alstom.
Alstom is one of the leaders in the railway industry and the largest railway company in Sweden.
The dataset contains 180 requirements, written in English, for the development of a train
propulsion system, used in one of their previous projects. The average requirement length
is 32.46 words. The shortest requirement contains 8 words, and the longest one has 205 words.
Most of the requirements, or some parts of them, are written in passive voice. The requirements
were developed from the customer specifications, and they describe both hardware and software
components of the system.
   We used the selected tools described in Section 3 to analyze the requirements and detect potential
ambiguities in them. We compared the results of the tools’ analysis with the analysis of an Alstom
engineer responsible for requirements engineering, with years of experience in the railway
domain. We asked the engineer to “Identify requirements in the dataset that have defects
and could cause ambiguity among different stakeholders”. The engineer was asked to mark a
requirement with “yes“ if it was potentially ambiguous and with “no“ otherwise. We
presented several examples of ambiguous sentences to the engineer. The ambiguous sentences
were developed using definitions and examples from the Ambiguity Handbook [20].

    • “I saw a man at the bank.” - the bank can be a financial institution or ground bordering a
      river
    • “I saw Peter and Paul and Mary saw me.” - can be understood as I saw (Peter and Paul)
      and Mary saw me, or I saw Peter and (Paul and Mary) saw me
    • “All linguists prefer a theory.” - can be understood as all linguists love the same one
      theory, or each linguist loves a, perhaps different, theory
    • “The trucks shall treat the roads before they freeze.” - pronoun “they” can be either trucks
      or roads
    • “Avoid long C functions.” - we do not know the value of “long”

   Results from the engineer’s analysis are taken as the ground truth. If a requirement was
marked as ambiguous by the engineer and was also identified by the tool, it was counted as a true
positive (TP). A false positive (FP) is a requirement marked as ambiguous by the tool but not by
the engineer. A false negative (FN) is a requirement marked as ambiguous by the engineer but not
identified as ambiguous by the tool. We used precision and recall as performance metrics to
evaluate the tools, since they are among the most common metrics for tool evaluation. Precision
and recall are calculated as Precision = TP/(TP + FP) and Recall = TP/(TP + FN).
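As a concrete check of these definitions, the values reported later in Table 2 can be reproduced directly from the TP/FP/FN counts, shown here for the QuARS row:

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Precision = TP / (TP + FP); Recall = TP / (TP + FN)."""
    return tp / (tp + fp), tp / (tp + fn)

# QuARS row of Table 2: 63 TP, 85 FP, 11 FN.
precision, recall = precision_recall(63, 85, 11)
print(round(precision, 2), round(recall, 2))  # 0.43 0.85
```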


5. Results
Results from the tools’ analysis are presented in Table 1. We group the results by the tools’ quality
indicators. If a tool does not support a specific category, a dash symbol “-“ is used in the
table. The ARM tool identified 7 potential issues that fit the option quality indicator category
and 10 for the weak phrase quality indicator. The same results for these two categories came
from the RCM tool. This is expected since both tools use the same terms for these two categories.
The RCM tool also identified 2 issues for subjectivity and 38 issues caused by vague terms.
The terms “can“ and “may“ are used by ARM and RCM to identify issues for the option quality
indicator, while QuARS uses them, along with additional terms, to identify weakness in
requirements. QuARS identified 18 issues that could cause weakness in requirements, 5 issues for
subjectivity, and 42 vagueness issues. In its syntactic analysis, QuARS identified 21 implicity, 116
multiplicity, and 32 under-specification issues. The RETA tool identified 56 issues with vague
terms in requirements, 42 issues with quantifier terms, 52 pronouns, 66 complex requirements,
and 51 adverbs. RETA also uses patterns to identify plural terms and passive sentences. Because
of this, almost all requirements were marked as potentially ambiguous. It is important to mention
that multiple issues can be identified in one requirement.
   The results of the tool evaluation are presented in Table 2. We compared the tools’ analysis with the
analysis performed by the engineer from Alstom, who identified 74 requirements as ambiguous.
RCM achieved the highest precision in our analysis, 0.68. The rest of the tools achieved similar
precision to each other: ARM and QuARS have a precision of 0.43, while RETA’s precision is
0.41. RETA also achieved the highest recall, 0.98, by identifying 177 requirements as potentially
ambiguous. At first, this looks like a significant result, since in the literature recall is considered
more important [21]. But it is essential to mention that this result was expected, because RETA
identifies plural words and passive sentences as issues that could cause ambiguity. The second
Table 1
Ambiguity issues identified by each tool (“-“ stands for not supported by the tool)

                  Category   ARM   QuARS   RETA   RCM
        Option/Optionality     7       0      -     7
      Weak Phrase/Weakness    10      18      -    10
              Subjectivity     -       5      -     2
      Vagueness/Vague Term     -      42     56    38
                Quantifier     -       -     42     -
                   Pronoun     -       -     52     -
                    Plural     -       -    301     -
                   Passive     -       -    159     -
                   Complex     -       -     66     -
                    Adverb     -       -     51     -
                    And/Or     -       -    154     -
                 Implicity     -      21      -     -
              Multiplicity     -     116      -     -
        Underspecification     -      32      -     -

highest recall was achieved by QuARS with 0.85, followed by RCM with 0.38 and the ARM tool
with only 0.08.
   From the results, we can conclude that it is hard to achieve high precision in ambiguity
detection using solutions based on pattern detection. Even though it is possible to achieve
a very high recall with an extensive set of patterns, it also results in many false-positive
results. Similar results were also reported in [7]. The RCM tool achieved the highest precision,
which could be expected because it was developed inside Alstom.

Table 2
Summary of the tools’ analysis

           True Positive   False Positive   False Negative   Precision   Recall
    ARM                6                8               68        0.43     0.08
  QuARS               63               85               11        0.43     0.85
   RETA               73              103                1        0.41     0.98
    RCM               28               13               46        0.68     0.38


6. Threats to validity
In this section, we present the threats that could affect the validity of our results. According
to [22], the threats to validity can be categorized as construct validity, internal validity, external
validity, and reliability.
   Construct validity refers to the extent to which the studied operational measures represent what
the researchers intended to study. A threat to the construct validity of our results is the limited
number of tools selected for this paper. As mentioned in Section 3, many tools are not publicly
available. To tackle this threat, we included all relevant tools that, to the best of our knowledge,
were publicly available.
   External validity refers to what extent results from the study can be generalized. The presented
findings in this study are obtained from the analysis of system requirements used to develop an
electric train propulsion system. The dataset we used for evaluation contains 180 requirements,
which might not be enough to evaluate the general usage of the selected tools. We do not claim
that these results can be generalized to other contexts. However, we believe that these findings
could be helpful for the further evaluation or usage of these tools or to gain insights for future
tool development.
   Internal validity refers to the extent to which relations between investigated factors can affect the
credibility of the results. Assessing whether something is ambiguous or not is a subjective activity,
and it depends on the analyst’s experience and interpretation. Also, to determine the level
of ambiguity, it is more common to analyze and compare answers from multiple analysts.
To mitigate this threat, we asked an expert in requirements engineering with years of
experience working in the railway industry to analyze the requirements and establish the ground truth.
   Reliability refers to the extent to which the data and the analysis depend on the researchers who
performed the study. To mitigate this threat, we asked an expert to analyze the system requirements.
The expert was not involved in the design of the study or in performing the calculations
of the results. Also, the tools were run after the expert had finished the analysis.


7. Conclusions and Future work
We analyzed four different requirements analysis tools using 180 system requirements from a
larger project in train propulsion design. The tools in the study all use a pattern-based approach
to identify ambiguities in requirements. Our analysis showed that it is possible to achieve
high recall by following this approach, but achieving high precision is challenging. The RCM tool,
developed inside Alstom, had the highest precision in our analysis, while the RETA tool achieved
the highest recall.
   A possible explanation is that domain knowledge plays an important role in identifying
ambiguities. To the best of our knowledge, no papers have investigated how domain knowledge
affects ambiguity identification. Therefore, how domain knowledge and experience affect the
identification of ambiguous requirements is a question that we would like to examine in the future.
We plan to survey industry practitioners and academics to explore these questions. This kind of
study would help us better understand the problem of ambiguity and what kinds of ambiguity
we should focus our research on in the future. We also aim to analyze other approaches and
tools and identify which techniques can achieve the highest precision and a low number of
false-positive cases in our context.


8. Acknowledgments
This research work has received funding from the ECSEL Joint Undertaking (JU) under grant
agreement No 101007350. The JU receives support from the European Union’s Horizon 2020
research and innovation programme and Sweden, Austria, Czech Republic, Finland, France,
Italy, Spain.


References
 [1] Inflectra, Principles of requirements engineering or requirements management 101, 2022.
     URL: https://www.inflectra.com/ideas/whitepaper/principles-of-requirements-engineering.aspx.
 [2] B. Rosadini, A. Ferrari, G. Gori, A. Fantechi, S. Gnesi, I. Trotta, S. Bacherini, Using
     NLP to detect requirements defects: An industrial experience in the railway domain, in:
     International Working Conference on Requirements Engineering: Foundation for Software
     Quality, Springer, 2017, pp. 344–360.
 [3] M. Kassab, C. Neill, P. Laplante, State of practice in requirements engineering: contempo-
     rary data, Innovations in Systems and Software Engineering 10 (2014) 235–241.
 [4] D. M. Fernández, S. Wagner, M. Kalinowski, M. Felderer, P. Mafra, A. Vetrò, T. Conte, M.-T.
     Christiansson, D. Greer, C. Lassenius, et al., Naming the pain in requirements engineering,
     Empirical software engineering 22 (2017) 2298–2338.
 [5] C. Arora, M. Sabetzadeh, L. Briand, F. Zimmer, Automated checking of conformance to
     requirements templates using natural language processing, IEEE transactions on Software
     Engineering 41 (2015) 944–968.
 [6] B. Gleich, O. Creighton, L. Kof, Ambiguity detection: Towards a tool explaining ambiguity
     sources, in: International Working Conference on Requirements Engineering: Foundation
     for Software Quality, Springer, 2010, pp. 218–232.
 [7] S. F. Tjong, D. M. Berry, The design of SREE—a prototype potential ambiguity finder for
     requirements specifications and lessons learned, in: International Working Conference on
     Requirements Engineering: Foundation for Software Quality, Springer, 2013, pp. 80–95.
 [8] N. Carlson, P. Laplante, The NASA automated requirements measurement tool: a recon-
     struction, Innovations in Systems and Software Engineering 10 (2014) 77–91.
 [9] G. Lami, S. Gnesi, F. Fabbrini, M. Fusani, G. Trentanni, An automatic tool for the analysis of
     natural language requirements, Technical report, CNR Information Science and Technology
     Institute, Pisa, Italy, September 2004.
[10] K. Rajković, Measuring the complexity of natural language requirements in industrial
     control systems, 2019.
[11] L. Zhao, W. Alhoshan, A. Ferrari, K. J. Letsholo, M. Ajagbe, E.-V. Chioasca, R. T. Batista-
     Navarro, Natural language processing (NLP) for requirements engineering (RE): A system-
     atic mapping study, ACM Computing Surveys (2020).
[12] A. Brooks, L. Krebs, B. Paulsen, Beta-testing a requirements analysis tool, ACM SIGSOFT
     Software Engineering Notes 39 (2014) 1–6.
[13] M. Arrabito, A. Fantechi, S. Gnesi, L. Semini, An experience with the application of
     three NLP tools for the analysis of natural language requirements, in: International
     Conference on the Quality of Information and Communications Technology, Springer,
     2020, pp. 488–498.
[14] M. Q. Riaz, W. H. Butt, S. Rehman, Automatic detection of ambiguous software require-
     ments: An insight, in: 2019 5th International Conference on Information Management
     (ICIM), IEEE, 2019, pp. 1–6.
[15] K. H. Oo, A. Nordin, A. R. Ismail, S. Sulaiman, An analysis of ambiguity detection techniques
     for software requirements specification (SRS), International Journal of Engineering &
     Technology 7 (2018) 501–505.
[16] A. Yadav, A. Patel, M. Shah, A comprehensive review on resolving ambiguities in natural
     language processing, AI Open 2 (2021) 85–92.
[17] U. S. Shah, D. C. Jinwala, Resolving ambiguities in natural language software requirements:
     a comprehensive survey, ACM SIGSOFT Software Engineering Notes 40 (2015) 1–7.
[18] C. Rupp, K. Pohl, Requirements engineering fundamentals, Rocky Nook (2011).
[19] A. Mavin, P. Wilkinson, A. Harwood, M. Novak, Easy approach to requirements syntax
     (EARS), in: 2009 17th IEEE International Requirements Engineering Conference, IEEE,
     2009, pp. 317–322.
[20] D. M. Berry, E. Kamsties, M. M. Krieger, From contract drafting to software specification:
     Linguistic sources of ambiguity, 2003. URL: http://se.uwaterloo.ca/~dberry/handbook/
     ambiguityHandbook.pdf.
[21] D. M. Berry, Empirical evaluation of tools for hairy requirements engineering tasks,
     Empirical Software Engineering 26 (2021) 1–77.
[22] P. Runeson, M. Höst, Guidelines for conducting and reporting case study research in
     software engineering, Empirical software engineering 14 (2009) 131–164.