<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>ORKS for CLASSIFICATION of Natural Language-based NON-FUNCTIONAL REQUIREMENTS</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="editor">
          <string-name>Requirements Engineering, Machine Learning, Natural Language Processing</string-name>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer and Information Sciences, Towson University</institution>
          ,
          <addr-line>Towson, MD 21252</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>In: F.B. Aydemir, C. Gralha, S. Abualhaija, T. Breaux, M. Daneva, N. Ernst, A. Ferrari, X. Franch, S. Ghanavati, E. Groen, R. Guizzardi, J. Guo, A. Herrmann, J. Horkoff, P. Mennig, E. Paja, A. Perini, N. Seyff, A. Susi, A. Vogelsang (eds.): Joint Proceedings of REFSQ-2021 Workshops, OpenRE, Posters and Tools Track, and Doctoral Symposium</institution>
          ,
          <addr-line>Essen</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>In software projects, non-functional software requirements (NFRs) are critical because they specify system quality and constraints. As NFRs are written in natural language, accurately analyzing NFRs requires domain knowledge, expertise, and significant human effort. Automated approaches that can help identify and classify NFRs can lead to reduced ambiguity and misunderstanding among software engineers, decreasing development costs and increasing software quality. This paper investigates the effectiveness of leveraging machine learning techniques to automatically classify various types of NFRs. Specifically, we develop and train a recurrent neural network model, which has been evaluated to be effective in handling sequential natural language text, to classify natural language NFRs into five different categories: maintainability, operability, performance, security, and usability. We evaluate and detail insights from the experimental study performed on two data sets that contain almost 1,000 NFRs. The experimental results show that this approach can classify NFRs with an average precision of 84%, recall of 85%, F1-score near 84%, and classification accuracy of 88% on the testing data set. As indicated by the results, applying appropriate machine learning techniques can help reduce manual effort, eliminate human mistakes, facilitate the software requirements analysis process, and lessen development costs.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-2">
      <title>1. Introduction</title>
      <p>
        Requirements Engineering (RE) is an essential phase in the software development process [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
Software requirements are broadly classified into two main categories: functional requirements
(FRs) and non-functional requirements (NFRs). NFRs define a software system’s constraints and
quality expectations concerning the underlying intricacies involved in integrating disparate
subsystems and different architectural layers. A few of the almost 150 categories of NFRs
[
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] are maintainability, operability, performance, usability, and security. Understanding and
implementing NFRs are critical to a software project. However, many developers overlook the
importance of NFRs [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], pay less attention to NFRs due to the financial and technical burden on
the organization [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], or postpone the implementation of NFRs until late in the development process
[
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. All of these significantly reduce the overall quality of software, lead to additional cost
and effort, and often result in frequent maintenance changes [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
      </p>
      <p>
        NFRs are often subjective and are dependent upon each organization’s infrastructure
capabilities. Consequently, the stakeholder’s quality concerns that are encapsulated in NFRs become
disorganized in the overall RE process, resulting in aspects related to NFRs being scattered
across specification artifacts [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. In addition, because domain experts may not be
available during the early RE phase, many NFRs are identified in the later phases of the software
development process or are not explicitly managed. Having engineers and experts go through and
identify every NFR buried in large requirement documents is tedious, time-consuming,
and error-prone. This often results in difficulties in isolating an NFR’s functionality-related
information and hinders the development of a holistic understanding of NFRs.
Consequently, NFRs not only need to be identified, but also need to be consolidated for effective
sense-making and implementation. Such consolidation often requires classifying NFR
descriptions into a pre-existing taxonomy. However, such classification is not a trivial task, as quality
concerns are often expressed in natural language. Understanding them becomes a subjective
process that depends upon one’s interpretation of the language, which hinges on education,
domain knowledge, expertise, etc. [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. Due to these constraints, manual NFR classification is a
labor-intensive process with potential human errors that can prove costly. Chung et al. [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]
identified that natural language NFRs can further conflict with each other in meaning and are often
interconnected, making the manual classification process cumbersome. Automating this task
can, therefore, increase the efficiency and effectiveness of the RE process. Machine-learning
models for classification, such as supervised neural networks, offer an avenue for exploration.
      </p>
      <p>This paper develops a multi-class NFR classification approach using a type of Recurrent Neural Network
(RNN), the Long Short-Term Memory (LSTM) model. Our best-performing LSTM model achieves
average precision, recall, and F1-score across all target classes of 84%, 85%,
and 84%, respectively, on the testing data set. The accuracy of the LSTM on unseen testing data is 88%. Specifically, the contributions
of this paper are: 1) the development of an approach using an LSTM to automatically classify
NFRs; 2) an experimental study of the approach with widely used NFR data sets and reported
comparisons to other existing approaches; and 3) an automated NFR analysis process to reduce
manual effort and human mistakes and potentially increase software quality.</p>
      <p>The rest of this paper is organized as follows. Section 2 presents related work. Section 3
introduces the approach designed in our study. Section 4 describes the experimental study used
to evaluate our approach and analyzes the results. Section 5 identifies threats to validity and
Section 6 concludes the paper and suggests future work.</p>
    </sec>
    <sec id="sec-3">
      <title>2. Related Work</title>
      <p>
        Categorization of NFRs has been carried out by several researchers over the past years. A
comprehensive categorization of NFRs was performed by Chung et al. [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] where almost 150
categories of NFRs were identified. Roman [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] introduced six classes of NFRs: economic,
interface, life-cycle, operating, performance, and political. Slankas and Williams [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] described
14 categories of NFRs. Based on these results, it is evident that the classification of NFRs depends
upon the domain expertise of the software engineers performing the task. The subjective nature of
this manual classification introduces errors, thereby reducing the quality of software. This paper
proposes an automated approach to the classification of NFRs that can help mitigate these issues.
      </p>
      <p>
        Several attempts have been made to classify NFRs automatically. In [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], Cleland-Huang et al.
used information retrieval techniques to identify ‘indicator terms’ from NFRs for classification.
Lu and Liang [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] attempted to classify the NFRs from the user application reviews using machine
learning algorithms, including Naïve Bayes, Bagging, and a combined classification technique
consisting of Bag of Words (BoW). Dekhtyar and Fong implemented a Convolutional Neural
Network (CNN) model using Word2vec pre-trained algorithm for NFRs [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]; however, they were
only able to perform binary classification and did not classify NFRs into different categories.
      </p>
      <p>
        In prior work [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], we experimented with automated classification of NFRs using Artificial
Neural Network (ANN) and CNN models. Using a Support Vector Machine, the authors of [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] developed
a supervised learning approach based on meta-data, lexical, and syntactical features. Their
approach produced good results for binary classification with manually selected features but
did not perform well with multi-class classification of NFRs. One of the drawbacks identified
with traditional feed-forward deep learning neural networks, such as CNNs, is their inability to
preserve representations from previous input data for use in their learning process. This
can lead to issues in classification problems with sequenced text, where there are significant
relationships between a set of statements. While CNNs have been successful in handling
computer vision data, RNNs have become the de facto method for modeling text and audio data.
This makes RNNs potentially better candidates for NFR classification.
      </p>
      <p>
        RNNs are a type of deep learning method that is well suited for processing
variable-length sequences of input data. In this study, we aim to compare the results of our RNN
LSTM with the CNN results from prior work and present our findings. Prior to our study, there
was very little research on NFR classification with RNNs. In [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], Rahman et al. proposed a similar
approach to the NFR classification problem by experimenting with the RNN architecture’s
LSTM and Gated Recurrent Unit (GRU) variants. However, a potential issue with Rahman et
al. is that their data set has only 370 NFRs, which is quite small for complex neural networks and
may not result in reliably trained models. Our research rectifies this data set issue by
using a combination of two large public data sets containing almost 1,000 NFRs. In addition,
the approach of Rahman et al. [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] used manually selected test data in the experiment without
explaining how the test data were selected.
      </p>
    </sec>
    <sec id="sec-4">
      <title>3. Approach</title>
      <p>This section explains the approach followed in classifying NFRs using RNNs. RNNs were
developed with recurrent links that dynamically store memory about
prior representations of the data by passing internal states from the hidden layers of previous steps
back to the current step, thereby allowing the network to contextualize the training data for better
results. The feedback loops in RNNs allow information to persist. A traditional RNN model
works better if the number of words in a sequence is small, and loses more information
as the number of words in the sequential input data increases. This is because the weight
assigned to a word that appears earlier in the time series decreases as new words in the
sequence gain more weight while the model moves forward. This impacts the unique contextual
learning behavior of the RNN model, particularly if words from further back in the current
time series are required to connect the information for classification.</p>
      <p>
        Specifically, our research employs a particular form of RNN: the LSTM model. The LSTM
structure is slightly more complex than a basic recurrent neural network, with four layers
controlling the vector flow. In [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], Olah explains in simple terms the architecture of LSTM
recurrent neural networks. The following steps are followed to construct the LSTM
RNN model in our study: 1) data set pre-processing; 2) word vectorization; and 3)
RNN model construction. These steps are described next.
      </p>
      <p>The first step requires pre-processing the input text features of the NFR data set. The NFR data
set has two features - an NFR requirement and the pre-labeled target class. The pre-processing
involved five sub-steps: a) conversion of the text to all lowercase to provide consistent input to
the model and to remove additional noise; b) replacement of certain mathematical symbols with
spaces; c) removal of special characters, using regular expressions; d) removal of common
stopwords using the nltk natural language processing library; and e) removal of stopwords
unique to the data set. The second step is tokenization and word embedding. As the deep
learning model understands only numeric inputs, the text has to be broken down into “tokens”
through “tokenization.” We tokenize the NFR data using the Keras Tokenizer function1.</p>
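      <p>As a minimal sketch (not the exact pipeline from the study), the pre-processing and tokenization steps above can be approximated in plain Python; the stop-word list and symbol set below are illustrative assumptions standing in for nltk's list and the Keras Tokenizer:</p>

```python
import re

# Illustrative stop words; the study used nltk's list plus data-set-specific terms.
STOPWORDS = {"the", "a", "an", "shall", "be", "to", "of"}

def preprocess(text):
    """Apply the five pre-processing sub-steps to one NFR description."""
    text = text.lower()                       # a) lowercase for consistency
    text = re.sub(r"[+=*/]", " ", text)       # b) replace math symbols (illustrative set)
    text = re.sub(r"[^a-z0-9\s]", "", text)   # c) strip special characters
    tokens = [w for w in text.split() if w not in STOPWORDS]  # d)+e) remove stop words
    return " ".join(tokens)

def build_vocab(requirements):
    """Minimal stand-in for the Keras Tokenizer: map each word to an integer id."""
    vocab = {}
    for req in requirements:
        for word in preprocess(req).split():
            if word not in vocab:
                vocab[word] = len(vocab) + 1  # id 0 is reserved for padding
    return vocab

def to_sequence(text, vocab):
    """Convert one requirement into the integer sequence the model consumes."""
    return [vocab[w] for w in preprocess(text).split() if w in vocab]
```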
      <p>Another important aspect is the word embedding process. With the RNN model, where the
context of words is important, one-hot encoding by itself is not sufficient. To preserve relationships
between words and to retain context, we perform a process called word embedding. The word
embeddings are added as a set of weight parameters that indicate the context, similarity, and
closeness between words. We experimented with the Word2vec2 pretrained model embedding.
Word2vec is an unsupervised model that uses Google’s pretrained word embedding algorithm,
where the model is trained on over 100 billion words. It seeks to embed words such that words
often found in similar contexts are located near one another in the embedding space.</p>
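      <p>A common way to hand pretrained vectors to the model (an assumed implementation detail, since the paper does not show code) is to assemble an embedding matrix indexed by token id, with zero rows for the padding token and out-of-vocabulary words:</p>

```python
import numpy as np

def build_embedding_matrix(vocab, word_vectors, dim):
    """Row i holds the pretrained vector for the word whose token id is i.

    `word_vectors` is any mapping from word to vector, e.g. gensim KeyedVectors
    loaded from the Google News Word2vec file (dim = 300 in that case). Words
    missing from the pretrained model keep a zero vector, a common (assumed)
    fallback; row 0 stays zero for the padding token.
    """
    matrix = np.zeros((len(vocab) + 1, dim))
    for word, idx in vocab.items():
        if word in word_vectors:
            matrix[idx] = word_vectors[word]
    return matrix
```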
      <p>The final step is the RNN LSTM model construction. Figure 1 provides a sequential flow
diagram of the construction of the RNN LSTM. As RNNs are recurrent in nature, to create a
model which uses this architecture, we use the TensorFlow Keras API’s Sequential model function
to create a linear stack of sequential layers, i.e., input layer, hidden layer, and output layer. The
first input layer prepares the word embedding. Next, a spatial dropout layer is added to sustain
the meaning of all the words. Then the LSTM layer is added with its nodes set to 100. After the
1Keras Tokenizer - https://www.tensorflow.org/api_docs/python/tf/keras/preprocessing/text/Tokenizer
2Word2vec official link - https://code.google.com/archive/p/word2vec/
LSTM layer is created, the final output layer is constructed with an activation function chosen to be
“softmax.”</p>
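      <p>The layer stack described above can be sketched with the TensorFlow Keras API as follows; the spatial dropout rate and the frozen-embedding choice are illustrative assumptions, while the LSTM width of 100 and the softmax output follow the text:</p>

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

def build_model(vocab_size, embed_dim, num_classes, embedding_matrix=None):
    """Linear stack: embedding input layer, spatial dropout, LSTM(100), softmax output."""
    model = models.Sequential([
        layers.Embedding(vocab_size, embed_dim),
        layers.SpatialDropout1D(0.2),   # dropout rate is an illustrative choice
        layers.LSTM(100),               # 100 nodes, as in the paper
        layers.Dense(num_classes, activation="softmax"),
    ])
    if embedding_matrix is not None:
        # Load pretrained Word2vec weights; freezing them is an assumption,
        # the paper does not say whether the embedding layer was trainable.
        model.build(input_shape=(None, None))
        model.layers[0].set_weights([embedding_matrix])
        model.layers[0].trainable = False
    return model
```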
    </sec>
    <sec id="sec-5">
      <title>4. Experimental Evaluation and Results</title>
      <sec id="sec-5-1">
        <title>4.1. Data Set</title>
        <p>We use two large, publicly available data sets with pre-labeled NFRs: The International
Requirements Engineering Conference’s 2017 Data Challenge data set3 and the Predictor Models in
Software Engineering Data set (PROMISE)4. Both data sets contain NFRs and each requirement’s
class, so we combined the two data sets, resulting in 1,165 unique NFRs falling into 14 categories.
By combining the two data sets, we aim to address poor data availability issues raised in prior
studies and also to eliminate possible biases that may exist in the data sets. Since 9 of the 14
categories have a small number of entries, only 5 categories are used in this study, which results
in a total of 914 NFRs consisting of the following categories: maintainability (137), operability
(153), performance (113), security (354), and usability (157).</p>
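      <p>The category filtering described above can be sketched as follows; the record layout and field names are illustrative assumptions:</p>

```python
KEEP = {"maintainability", "operability", "performance", "security", "usability"}

def filter_categories(records, keep=KEEP):
    """Drop NFRs whose pre-labeled class is outside the five retained categories.

    `records` is a list of (text, label) pairs; the pair layout is illustrative.
    """
    return [(text, label) for text, label in records if label in keep]
```

With the combined data set, this step keeps 914 of the 1,165 NFRs.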
      </sec>
      <sec id="sec-5-2">
        <title>4.2. Experiment Setup</title>
        <p>For setting up the experiment, we compile the model using the Keras Compile5 function by
passing parameters with a specific loss function known as “categorical_crossentropy,” which is
generally used in multi-class classification problems and an optimizer function called “adam,”
which is derived from “adaptive moment estimation.” To study the performance of the model,
we include a set of commonly used metrics – precision (P), recall (R), F1-score (F1), and area
under the curve (AUC) – in the compile function to produce custom results for these metrics in
addition to the accuracy metrics. After the model is constructed, the next step in the process is
to train and test the model.</p>
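        <p>The compile step described above can be sketched as follows; since Keras versions differ in whether an F1 metric is built in, the sketch derives F1 from precision and recall instead of assuming one:</p>

```python
import tensorflow as tf

def compile_for_nfr(model):
    """Compile with the loss and optimizer named in the paper plus P, R, and
    AUC metrics; F1 is then derived from precision and recall."""
    model.compile(
        loss="categorical_crossentropy",   # multi-class loss
        optimizer="adam",                  # adaptive moment estimation
        metrics=[
            "accuracy",
            tf.keras.metrics.Precision(name="precision"),
            tf.keras.metrics.Recall(name="recall"),
            tf.keras.metrics.AUC(name="auc"),
        ],
    )
    return model

def f1_from(precision, recall):
    # Harmonic mean of precision and recall; 0.0 when both are zero.
    return 0.0 if precision + recall == 0 else 2 * precision * recall / (precision + recall)
```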
        <p>We split the data set into a 90% training and validation set and a 10% testing data set. The
model is trained and cross-validated using the 90% data set. To make the model perform
effectively, we train the model through cross-validation with “k” number of folds. This allows
the model training to weigh each instance equally so that over-represented classes don’t get too
much weight. Specifically, we performed a stratified k-fold cross-validation on the training data
set.</p>
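        <p>The split-then-cross-validate procedure can be sketched with scikit-learn (an assumed tool choice; the paper names only the 90/10 split and stratified k-fold validation, and the random seed here is illustrative):</p>

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, train_test_split

def split_and_folds(X, y, n_folds=10, test_size=0.10, seed=42):
    """90/10 stratified train-test split, then stratified k-fold indices
    over the 90% training portion."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, stratify=y, random_state=seed)
    skf = StratifiedKFold(n_splits=n_folds, shuffle=True, random_state=seed)
    folds = list(skf.split(X_train, y_train))  # (train_idx, val_idx) pairs
    return X_train, X_test, y_train, y_test, folds
```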
      </sec>
      <sec id="sec-5-3">
        <title>4.3. Results Analysis</title>
        <p>The experiment includes tuning different hyper-parameters and other features. Table 1 lists the
results using Word2vec embedding and shows values for the metrics obtained as a result of the
k-fold cross-validation process and the testing on unseen test data. The k-fold cross-validation
results show the average results and standard deviation values of these metrics from the 10-fold
training. The testing results show the results obtained by running the cross-validated model on
3http://ctp.di.fct.unl.pt/RE2017/pages/submission/data_papers/
4http://promise.site.uottawa.ca/SERepository
5Keras Metrics and Compile function - https://keras.io/api/metrics/
the test data set. The last column in Table 1 lists the classification accuracy calculated by the
model evaluation process. For hyper-parameters, we configure the batch size to be 10, 30, and 60,
the embedding layer dimensions set at 300, the epochs to be 50, and the LSTM layer nodes at 100. We highlight
the results with the highest classification accuracy associated with each hyper-parameter tuning. The
table also shows the k-fold cross-validation results for the standard deviation values of the
precision, recall, and F1-score averages from each run. The results show that the RNN Word2vec
model with batch size=30 achieves the highest accuracy at 88%.</p>
        <p>Using the best performing model with the best hyper-parameter configuration, five individual
tests were performed. Tables 2 and 3 show the results from five independent test runs of this
highest performing Word2vec model with the same data and tuning parameters to avoid any
accidental stochastic errors. The tables list the values of the metrics being evaluated: precision,
recall, F1-score, and AUC for each of the 5 target NFR classes from these 5 independent test
runs. These values were calculated for each class by analyzing the confusion matrix for each class
and identifying the true positives, true negatives, false positives, and false negatives.</p>
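        <p>The per-class computation described above can be sketched as follows, deriving precision, recall, and F1 for every class from a single multi-class confusion matrix:</p>

```python
import numpy as np

def per_class_metrics(conf):
    """Precision, recall, and F1 per class from a square confusion matrix
    (rows = true class, columns = predicted class)."""
    conf = np.asarray(conf, dtype=float)
    tp = np.diag(conf)
    fp = conf.sum(axis=0) - tp   # predicted as class c but actually another class
    fn = conf.sum(axis=1) - tp   # actually class c but predicted as another class
    precision = np.divide(tp, tp + fp, out=np.zeros_like(tp), where=(tp + fp) > 0)
    recall = np.divide(tp, tp + fn, out=np.zeros_like(tp), where=(tp + fn) > 0)
    denom = precision + recall
    f1 = np.divide(2 * precision * recall, denom, out=np.zeros_like(tp), where=denom > 0)
    return precision, recall, f1
```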
        <p>The results show the performance metrics for each of the target NFR classes. Overall, the
third and fourth test runs resulted in better classification accuracy, which is calculated by the
model based on the true predictions on the test data. However, on imbalanced data sets, the
classification accuracy calculated by the Keras in-built model evaluation 6 process may not
provide clear results. Thus, looking at the individual precision, recall, and F1-score values for
each class provides a better understanding of the results. Table 3 shows the average values of
each metric calculated across all classes and the classification accuracy reported by the model’s
evaluation process. The table shows the average precision values range between 81%-87%, recall
between 80%-87%, and F1-score between 80%-87% across these tests. The best classification
accuracy was 88%, found in Test Runs #3 and #4. Test Run #1 had the best average precision and
recall values. In Test Run #2, the Operability class shows 100% recall with only 8 test entries. In
Test Runs #2, #4, and #5, the Performance class shows 100% precision and recall with only 12 entries,
which indicates that the model performs well with imbalanced classes.</p>
        <p>Table 4 shows a comparison of the average metrics results and classification accuracy between
our testing data and the testing data results of prior RNN work. These results demonstrate that
our RNN LSTM model performed better on all performance metrics, with an average
classification accuracy at least 15% higher than the reported results. This indicates that our model
produced consistent results between independent test runs. A similar inference can be made for
6Keras model training - https://keras.io/api/models/model_training_apis/
other metrics, with our model producing better results across different test runs. An
advantage of the LSTM is its ability to retain needed memory for a longer period in a sequential process.
However, this can also be detrimental in some cases. For example, some of the NFRs have
similar word sequences - “100% of the cardmember and merchant services representatives shall use
the Disputes application regularly after a 2-day training course” and “The Disputes applications
shall interface with the Merchant Information Database. The Merchant Information Database
provides detailed information with regard to the merchant. All merchant detail information shall be
obtained from the Merchant Information”. The latter NFR was labeled as “Operability”
originally in the data set. When the RNN LSTM model learned the words from this NFR during
the training process, it stored in its memory that a certain combination of these words falls
under the “Operability” class. Consequently, during the testing process, when the former NFR
appeared, the model assigned scores split between the two classes. Due to this inherent
sequential memory storage, the model probably misclassified this NFR.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>5. Threats to Validity</title>
      <p>The RNN LSTM model trained and tested in this work faces threats arising from current
probabilistic mechanisms for dealing with multi-category classification problems. As neural network
models usually output real numbers as values for their classes, a transformation layer, such
as the “softmax” function at the end of the model, converts them into probabilities
between 0 and 1. The NumPy library’s argmax identifies the class with the highest probability
and assigns it as the observation’s target class. However, such an assignment based on the highest
probability has an issue. Post-hoc manual analysis of the results indicates the reason behind this
issue. Manual verification showed us that 11 out of the 92 tested NFRs were misclassified by the
model. For 7 of the 11 NFRs, the second highest probability predicted by the model belonged to
the class originally labeled in the data set. Of these, in 4 of the 7 NFRs the predicted classes had
highest probability scores of less than or equal to 50%, and the second highest probabilities were
numerically very close. This suggests that classification based on the highest probability without
the use of weights or confidence intervals poses an internal threat to validity in such multi-category
classification problems.</p>
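      <p>The issue above can be illustrated with a small helper that, in addition to the plain argmax assignment, flags predictions whose top two probabilities are numerically close; the margin threshold is an illustrative choice, not a value from the paper:</p>

```python
import numpy as np

def classify_with_margin(probs, min_margin=0.10):
    """Argmax class assignment plus an ambiguity flag.

    Returns (predicted_class, ambiguous) where `ambiguous` is True when the
    top two softmax probabilities differ by less than `min_margin`.
    """
    probs = np.asarray(probs, dtype=float)
    order = np.argsort(probs)[::-1]          # class ids, highest probability first
    top, second = probs[order[0]], probs[order[1]]
    return int(order[0]), bool(top - second < min_margin)
```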
      <p>
        One important external threat to the validity of the results from this study concerns the labeling
of the data set. As the NFRs in our data set are labeled manually, it is possible that the labeling
process may not be accurate. Instances such as mislabeling, missing information, or unavailability
of required information could lead to the model misinterpreting the words from the beginning.
One of the NFR statements in the data set was - “Data is validated for type, length, format, and
range”. This was labeled as the “Security” class. However, it is not very clear why this NFR was
labeled as “Security”. Such subjective ambiguity in labeling NFRs has been noted in previous
research [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. In [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], it was reported that five security experts individually could not identify
more than 50% of the security requirements with the information provided in the data set.
Such issues reduce confidence in the accuracy of a data set and can potentially impact the
performance of a neural net model.
      </p>
    </sec>
    <sec id="sec-7">
      <title>6. Conclusion and Future Work</title>
      <p>
        This paper introduces an LSTM-based approach to automatically classify NFRs. The results
from our RNN LSTM model are better than those of prior existing works (e.g., [
        <xref ref-type="bibr" rid="ref11 ref12 ref13">11, 12, 13</xref>
        ]). Even with
these better results, we still face threats due to the behavioral nature of the RNN LSTM’s learning
process. Part of this is due to the fact that the model takes its input word by word to
learn the context. To overcome this issue, we will perform research experiments on advanced
natural language processing methods using popular pre-trained language representation models,
such as Bidirectional Encoder Representations from Transformers (BERT)7. One advantage of
using this model is the ability to process embeddings at a sentence level rather than a word level,
thereby treating words in the sentences as “WordPieces.” In addition, as this study focused only
on categories of NFRs, in future work we will include FRs and assess the model’s performance.
Source code for this project can be found on GitHub8.
      <p>7https://huggingface.co/transformers/model_doc/bert.html
8https://github.com/rgnana1/NFR_Classification_RNN_LSTM</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>N. K.</given-names>
            <surname>Sethia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. S.</given-names>
            <surname>Pillai</surname>
          </string-name>
          ,
          <article-title>The efects of requirements elicitation issues on software project performance: An empirical analysis</article-title>
          , in: C.
          <string-name>
            <surname>Salinesi</surname>
          </string-name>
          , I. van de Weerd (Eds.), Requirements Engineering: Foundation for Software Quality, Springer,
          <year>2014</year>
          , pp.
          <fpage>285</fpage>
          -
          <lpage>300</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>L.</given-names>
            <surname>Chung</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. A.</given-names>
            <surname>Nixon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Mylopoulos</surname>
          </string-name>
          ,
          <article-title>Non-functional requirements in software engineering</article-title>
          , volume
          <volume>5</volume>
          ,
          <publisher-name>Springer Science &amp; Business Media</publisher-name>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>R. R.</given-names>
            <surname>Maiti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. J.</given-names>
            <surname>Mitropoulos</surname>
          </string-name>
          ,
          <article-title>Capturing, eliciting, predicting and prioritizing (CEPP) nonfunctional requirements metadata during the early stages of agile software development</article-title>
          ,
          <source>in: SoutheastCon</source>
          <year>2015</year>
          ,
          <year>2015</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>8</lpage>
          .
          <pub-id pub-id-type="doi">10.1109/SECON.2015.7133007</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>M.</given-names>
            <surname>Younas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. N. A.</given-names>
            <surname>Jawawi</surname>
          </string-name>
          , I. Ghani,
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Shah</surname>
          </string-name>
          ,
          <article-title>Extraction of non-functional requirement using semantic similarity distance</article-title>
          ,
          <source>Neural Computing and Applications</source>
          <volume>32</volume>
          (
          <year>2020</year>
          )
          <fpage>7383</fpage>
          -
          <lpage>7397</lpage>
          .
          <pub-id pub-id-type="doi">10.1007/s00521-019-04226-5</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>J.</given-names>
            <surname>Cleland-Huang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Settimi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Zou</surname>
          </string-name>
          , P. Solc,
          <article-title>Automated classification of non-functional requirements</article-title>
          , in: Requirements Engineering, volume
          <volume>12</volume>
          , Springer-Verlag New York, Inc.,
          <year>2007</year>
          , pp.
          <fpage>103</fpage>
          -
          <lpage>120</lpage>
          .
          <pub-id pub-id-type="doi">10.1007/s00766-007-0045-1</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>R.</given-names>
            <surname>Berntsson Svensson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Gorschek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Regnell</surname>
          </string-name>
          ,
          <article-title>Quality Requirements in Practice: An Interview Study in Requirements Engineering for Embedded Systems</article-title>
          , in:
          <string-name>
            <given-names>M.</given-names>
            <surname>Glinz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Heymans</surname>
          </string-name>
          (Eds.), Requirements Engineering: Foundation for Software Quality, Springer Berlin Heidelberg, Berlin, Heidelberg,
          <year>2009</year>
          , pp.
          <fpage>218</fpage>
          -
          <lpage>232</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>G.-C.</given-names>
            <surname>Roman</surname>
          </string-name>
          ,
          <article-title>A taxonomy of current issues in requirements engineering</article-title>
          ,
          <source>Computer</source>
          <volume>18</volume>
          (
          <year>1985</year>
          )
          <fpage>14</fpage>
          -
          <lpage>23</lpage>
          .
          <pub-id pub-id-type="doi">10.1109/MC.1985.1662861</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>J.</given-names>
            <surname>Slankas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Williams</surname>
          </string-name>
          ,
          <article-title>Automated extraction of non-functional requirements in available documentation</article-title>
          ,
          <source>in: Proc. 1st International Workshop on Natural Language Analysis in Software Engineering</source>
          , IEEE,
          <year>2013</year>
          , pp.
          <fpage>9</fpage>
          -
          <lpage>16</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>M.</given-names>
            <surname>Lu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Liang</surname>
          </string-name>
          ,
          <article-title>Automatic Classification of Non-Functional Requirements from Augmented App User Reviews</article-title>
          ,
          <source>Proceedings of the 21st International Conference on Evaluation and Assessment in Software Engineering - EASE'17</source>
          (
          <year>2017</year>
          )
          <fpage>344</fpage>
          -
          <lpage>353</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>A.</given-names>
            <surname>Dekhtyar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Fong</surname>
          </string-name>
          ,
          <article-title>RE Data Challenge: Requirements Identification with Word2Vec and TensorFlow</article-title>
          ,
          <year>2017</year>
          .
          <pub-id pub-id-type="doi">10.1109/RE.2017.26</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>C.</given-names>
            <surname>Baker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Deng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Chakraborty</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Dehlinger</surname>
          </string-name>
          ,
          <article-title>Automatic multi-class non-functional software requirements classification using neural networks</article-title>
          ,
          <source>in: Proceedings - International Computer Software and Applications Conference</source>
          , volume
          <volume>2</volume>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Kurtanović</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Maalej</surname>
          </string-name>
          ,
          <article-title>Automatically Classifying Functional and Non-functional Requirements Using Supervised Machine Learning</article-title>
          ,
          <source>in: Proc. IEEE 25th International Requirements Engineering Conference</source>
          , IEEE,
          <year>2017</year>
          , pp.
          <fpage>490</fpage>
          -
          <lpage>495</lpage>
          .
          <pub-id pub-id-type="doi">10.1109/RE.2017.82</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Rahman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Haque</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. N. A.</given-names>
            <surname>Tawhid</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. S.</given-names>
            <surname>Siddik</surname>
          </string-name>
          ,
          <article-title>Classifying Non-Functional Requirements Using RNN Variants for Quality Software Development</article-title>
          ,
          <source>in: Proceedings of the 3rd ACM SIGSOFT International Workshop on Machine Learning Techniques for Software Quality Evaluation</source>
          ,
          <year>2019</year>
          , pp.
          <fpage>25</fpage>
          -
          <lpage>30</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>C.</given-names>
            <surname>Olah</surname>
          </string-name>
          ,
          <article-title>Understanding LSTM Networks - colah's blog</article-title>
          ,
          <year>2015</year>
          . URL: http://colah.github.io/posts/2015-08-Understanding-LSTMs/.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <source>25th IEEE International Requirements Engineering Conference</source>
          ,
          <year>2017</year>
          . URL: http://ctp.di.fct.unl.pt/RE2017/pages/submission/data_papers/.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>