<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Open Challenges in NLP for NFRs: A Focus on Semantics, Generalization, and Interpretability</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Rrezarta Krasniqi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="editor">
          <string-name>Requirements Engineering, Non-Functional Requirements, Semantic Soundness, Generalizability, Interpretability</string-name>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Software and Information Systems, University of North Carolina at Charlotte</institution>
          ,
          <addr-line>Charlotte, NC</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>In: Muhammad Abbas, Fatma Başak Aydemir</institution>
          ,
          <addr-line>Maya Daneva, Renata Guizzardi, Jens Gulden, Andrea Herrmann, Jennifer Horkoff</addr-line>
        </aff>
      </contrib-group>
      <abstract>
        <p>Leveraging natural language processing (NLP) models within the non-functional requirements (NFR) domain has proven highly effective in addressing various issues, including automated traceability, classification of NFR compliance documents, and NFR prioritization, among others. Despite these significant advancements, open challenges remain in the full integration of NLP models in the NFR domain. For example, NLP models used to capture the semantics of complex phrases in safety-critical NFRs must not lead to misinterpretations and potential safety risks. Therefore, this paper focuses on three key challenges related to semantic soundness, ontology generalizability, and the interpretability of model outcomes. These challenges have been chosen for several reasons. First, the absence of semantic precision can result in the misinterpretation of NFRs. Second, given that NFRs cover diverse domains, NLP models must generalize across these domains. Lastly, many problems within the NFR domain rely on decision-making based on predictions from NLP models. However, frequently adopted traditional NLP models such as ensemble models or kernel models are often regarded as ‘black boxes,’ with output predictions that are challenging to interpret. Guided by these insights, we present a roadmap agenda through 10 implicit system-based scenarios drawn from the NFR perspective. These scenarios illustrate gaps where these NLP challenges become evident within the NFR domain. Additionally, we suggest solutions, strategies, and alternative approaches to better address these NLP challenges.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>In the early stages of requirements engineering, descriptions of requirements specifications often take
an informal approach [1]. Typically, these requirements are expressed in natural language [2]. However,
written requirements are inherently ambiguous, inconsistent, and lack structure [3]. This issue becomes
more evident when dealing with non-functional requirements (NFRs). For example, the NFR
specification ‘The product shall retrieve query results in a reasonable time.’ is vague and fails to
describe what ‘reasonable time’ means. Existing traditional natural language processing (NLP) models
are unable to fully draw contextual semantic inferences from such texts. Hence, this brings another
issue into perspective: the need for building more refined ontologies suitable to NFRs. While various
NLP-based techniques attempt to address this issue, they have concurrently become more challenging to
interpret due to their complex underlying design [4]. Due to their dense internal model representations,
they have almost become unusable among practitioners [5, 6]. Their internal calculations that lead
to predictive outputs often resemble black boxes [7], raising concerns about their interpretability. In
light of these developments, requirements analysts must exercise caution to ensure that the output
predictions generated by these models align with human reasoning. Consequently, a sole reliance on
NLP predictive models can carry consequences within the NFR domain, where decisions are based upon those
predictions. We reason that NLP models should be built upon strong attributes of transparency and explain
how they arrived at those output predictions. Because this is such a major challenge nowadays, the
need for human-in-the-loop is a necessity rather than a choice [8]. Based on these observations, the
following research question is formulated.</p>
      <p>rrezarta.krasniqi@charlotte.edu (R. Krasniqi)</p>
      <p>What are the key challenges to leveraging existing NLP models within the NFR domain, with
respect to semantic soundness, ontology generalizability, and output interpretability?
While other challenges within NLP models, such as efficiency and scalability problems [9, 10], do exist,
they are perceived as less critical compared to the lack of semantic soundness, ontology generalizability,
and interpretability that such models entail if they were to be used within the NFR domain [11, 12]. We
reason that the lack of semantic soundness in NLP models can lead to misinterpretations of complex
phrases, such as those found in regulatory documents for safety-critical systems. Moreover, NFRs
span diverse domains, each characterized by domain-specific terminologies. In such cases, NLP
models need to expand domain boundaries. Hence, the lack of ontology generalizability can diminish the
applicability of NLP models within the NFR domain [13]. Furthermore, because NFRs are so heterogeneous
and exceed domain boundaries, final decision-making relies on predictions derived from traditional NLP
models such as ensemble and/or VSM models. However, these models are often regarded
as ‘black boxes’ with output predictions that are challenging to interpret. Their lack of interpretability
can prevent both practitioners and researchers from using them due to potential reliability implications.
We explore 10 implicit, system-based scenarios (from an NFR perspective) illustrating key challenges,
and then provide recommendations and solutions.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Background</title>
      <p>The NLP domain has advanced both in technical scope and in applicability across multiple disciplinary
areas such as medicine, finance, marketing, education, machine translation, text summarization, speech
recognition, and chatbots, among others. Despite this systematic progress, the unresolved issues
of semantic soundness, ontology generalizability, and interpretability pose a substantial risk to its
full potential for solving problems in the NFR domain. To ensure clarity, we briefly define semantic
soundness, ontology generalizability, and interpretability within the NLP context.</p>
      <sec id="sec-2-1">
        <title>2.1. Semantic Soundness</title>
        <p>Semantic soundness in NLP pertains to the models’ capability to accurately understand and interpret
the meaning of words, phrases, and sentences in natural language [14]. Improving NLP semantic
soundness is essential when analyzing intricate safety-critical regulatory documents requiring precise
comprehension. Challenges in semantic soundness frequently arise due to word ambiguity, polysemy,
or inconsistencies stemming from terminological usage. As a result, NLP models lack precision in
discerning the semantic context of words or the meaning of complex NFR phrases, potentially leading to
wrong interpretations and unsatisfactory outcomes.</p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Ontology Generalizability</title>
        <p>Ontology generalizability pertains to the adaptability of NLP models across various domains and contexts
[15]. From an NLP perspective, these models are frequently tailored to a particular domain or subset of
domains. They provide pre-trained word-embeddings, specific terminologies and/or vocabularies that
may not be applicable in many disciplines, including the conventional NFR knowledge base. As a result,
NLP models often lack the necessary semantic knowledge and understanding to be effectively used
within the NFR domain, highlighting the need for research into more generalizable NLP models for
requirements engineering.</p>
      </sec>
      <sec id="sec-2-3">
        <title>2.3. Model Output Interpretability</title>
        <p>Model output interpretability pertains to the extent of transparency in the decision-making processes
computed by NLP models [16]. Within the NFR domain, where NLP models play a crucial role in making
critical decisions based on user inputs, the ability to interpret model output predictions becomes a
human accountability concern. For example, safety standards for self-driving cars are stringent [17]. If
NLP models predict issues related to sensor data, additional interpretation might be necessary to ensure
the safety of passengers. The lack of interpretability can lead to a lack of trust among stakeholders,
including end-users, architects, developers, and safety operators. This lack of understanding about how
these models arrive at specific outcomes can undermine human confidence in relying on them.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Open NLP Challenges in the Context of NFRs</title>
      <p>In this section, we turn our focus to three core NLP challenges and their respective shortcomings: (A)
lack of semantic soundness, (B) issues with ontology generalizability, and (C) the level of interpretability
inherent in the models. We present illustrative problem scenarios that span a range of 10 diverse
domains, all with an emphasis on NFRs. These scenarios serve multiple purposes. First, they provide
brief insights into each domain. Second, they bring a user-centric perspective. Finally, they serve as a
bridging point connecting NLP challenges to the broader context of NFRs.</p>
      <sec id="sec-3-1">
        <title>3.1. NLP Semantic Soundness</title>
        <p>We explore and analyze NLP models’ semantic soundness, focusing on two dimensions: (1) semantic
similarity and (2) semantic interpretability.
(1) Semantic Similarity—Measuring the semantic similarity of requirements is challenging because
requirements are often written as short sentences, which means they contain implicit information that
can only be understood in a limited context. Typically, implicit information can be viewed by the
requirement analyst in a perceptive way.
• NFR Scenario #1 [18]: A financial organization is developing a trading platform, and one of the
non-functional requirements is defined as ‘NFR-1: The system should ensure high security for financial
transactions.’ The description of NFR-1 uses semantics that can be open to multiple interpretations.
If we closely examine the term ‘high security,’ it is open to multiple interpretations, such as encryption
protocols, authentication methods, data confidentiality and integrity, compliance, authorization,
authentication, availability, or access controls. Current NLP models will not be able to adequately identify
the security aspects crucial for financial transactions. As a result, the generated solutions might not
align with the stakeholders’ actual expectations.
• NFR Scenario #2 [19]: For example, if we analyze ‘R-1: The product shall preclude personal data
from being printed’ and ‘R-2: The system shall grant the user to print the invoice summary.’ Both R-1 and
R-2 share a similar semantic meaning, as both refer to the same task (‘print’). However, in the eyes of
requirement analysts, R-1 and R-2 differ since R-1 conveys security matters and R-2 is purely functional.
(2) Semantic Interpretability—While semantic representation models have improved the solving of
many RE tasks, including NFR ones, a deeper semantic analysis is necessary. A chief problem with most
such models lies in the usage of distributional semantics. The idea of distributional semantics is
that words that appear in similar contexts tend to have similar meanings. However, inaccurate semantic
relationships can skew NLP similarity scores, leading to unreliable interpretations, especially in NFR
domains that rely heavily on these scores for decision-making.
• NFR Scenario #3 [20]: Consider an NFR scenario in the context of a medical diagnosis system that
uses NLP to interpret patient symptoms and provide accurate diagnoses. One of the non-functional
requirements is defined as ‘NFR-3: The medical diagnosis system should accurately identify rare diseases
based on patient symptoms.’ The NLP model uses distributional semantics to analyze patient records and
identify symptoms associated with rare diseases. However, relying solely on context may not guarantee
the necessary soundness for detecting rare diseases. Certain symptoms may be contextually similar to
those of common diseases, leading to misdiagnoses and potential patient harm.
• NFR Scenario #4 [21]: A software development team is working on an e-commerce platform with a
specific NFR problem pertaining to security: ‘NFR-4: The system should ensure robust protection against
SQL injection attacks.’ In this scenario, NFR-4 involves the phrase ‘SQL injection attacks,’ which
is a critical security concern [22]. Semantic interpretability relies on contextual information from
documents to understand the meaning of words, and it may identify common contexts where the phrase
‘SQL injection’ appears, such as discussions about web application security or external vulnerabilities.
However, in these types of scenarios, NLP models that apply semantic interpretability fail to capture the
depth of knowledge required to interpret ‘SQL injection attacks.’ This lack of semantic soundness can
lead to potential misinterpretations, such as overlooking specific security measures needed to address
SQL injection attacks.
• NFR Scenario #5 [23]: A software development team is working on a new platform with an NFR
requirement: ‘NFR-5: The website should have low-latency response times during high-traffic events,
ensuring a smooth user experience.’ In this scenario, NFR-5 requires the system to handle high-traffic
events efficiently, ensuring low-latency response times. The team employs existing NLP models to
interpret user feedback and performance reports to identify issues with response times during peak
loads. An NLP model relying on semantic interpretability will not be able to accurately capture the
context and meaning of ‘slow’ during ‘sales.’ Due to the lack of soundness in interpreting the NFR
context, the NLP models will fail to identify the importance of addressing the specific performance issue
during high-traffic events. This can lead to subsequent delays in optimizing the website’s performance
during peak loads, resulting in a non-satisfactory user experience during critical events.</p>
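The surface-level similarity effect in Scenario #2 can be illustrated with a minimal bag-of-words sketch (pure Python, purely illustrative): the nonzero cosine score between R-1 and R-2 comes almost entirely from shared function words, while the security intent that separates the two requirements leaves no trace in the vectors.

```python
import math
import string
from collections import Counter

def tokenize(text):
    # Lowercase and strip punctuation; no stemming, so 'print' != 'printed'.
    table = str.maketrans("", "", string.punctuation)
    return text.lower().translate(table).split()

def cosine_similarity(a, b):
    # Cosine similarity over raw bag-of-words term counts.
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[w] * vb[w] for w in va)
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0

r1 = "The product shall preclude personal data from being printed."
r2 = "The system shall grant the user to print the invoice summary."
sim = cosine_similarity(r1, r2)
print(f"surface similarity: {sim:.2f}")  # nonzero, driven by 'the' and 'shall'
```

The score here is produced solely by ‘the’ and ‘shall’; nothing in the representation distinguishes a security constraint from a functional capability, which is precisely the gap the scenario highlights.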
      </sec>
      <sec id="sec-3-2">
        <title>3.2. NLP Ontology Generalizability</title>
        <p>
          NLP models are often built to fit certain problems and do not apply to a wide spectrum of problems. In
essence, they fail to incorporate robust, heterogeneous multi-models that meet diverse users’ needs.
These limitations impede the comprehensive extraction of knowledge from various domains, including
the NFR domain. Consequently, the NLP knowledge base lacks compositional aspects both in terms
of linguistic knowledge (e.g., morphological, syntactic, and lexical) and non-linguistic knowledge (i.e.,
pragmatic inference). That said, existing specific ontologies need to expand domain boundaries to be
effectively considered for NFR cross-domain demands.
• NFR Scenario #6 [24]: A healthcare organization is developing a medical diagnosis system to assist
doctors in diagnosing various diseases. One of the non-functional requirements is ‘NFR-6: The system
should provide accurate and efficient diagnosis for a wide range of medical conditions.’ Existing NLP
models used in medical diagnosis systems are often text-based and may not always handle other data
modalities and/or capture the full complexity of several medical conditions from which a patient
may be suffering. Hence, existing NLP models will fail to integrate and use information from multiple
sources or heterogeneous environments, subsequently leading to limited ontology generalizability.
• NFR Scenario #7 [
          <xref ref-type="bibr" rid="ref4">25</xref>
          ]: An educational institution is developing an e-learning platform, and one of
the non-functional requirements is ‘NFR-7: The system should support personalized learning experiences
for students.’ It is evident that NFR-7 requires the e-learning platform to provide personalized learning
experiences, tailoring educational content and resources to each student’s individual needs and learning
styles. However, current NLP models may not fully comprehend the intricacies of various subject
domains and educational contexts, limiting their ability to adapt and cater to the unique learning
preferences of different students effectively.
        </p>
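The coverage gap behind Scenarios #6 and #7 can be sketched with a hypothetical miniature ontology (every concept name and surface form below is invented for illustration): terms inside the general-purpose inventory resolve to a concept, while domain-specific phrases silently fail to match, which is the practical face of limited ontology generalizability.

```python
# Hypothetical general-purpose NFR ontology: concept -> known surface forms.
GENERAL_ONTOLOGY = {
    "performance": {"response time", "latency", "throughput"},
    "security": {"encryption", "access control", "authentication"},
    "usability": {"learnability", "accessibility"},
}

def match_concept(phrase, ontology):
    """Return the concept whose surface forms contain the phrase, else None."""
    phrase = phrase.lower()
    for concept, forms in ontology.items():
        if phrase in forms:
            return concept
    return None

# An in-vocabulary term resolves; domain-specific phrases from the medical and
# e-learning scenarios fall outside the ontology's coverage and return None.
print(match_concept("latency", GENERAL_ONTOLOGY))                # performance
print(match_concept("personalized learning", GENERAL_ONTOLOGY))  # None
print(match_concept("diagnostic accuracy", GENERAL_ONTOLOGY))    # None
```

A generalizable ontology would need either broader coverage or a mechanism (e.g., embedding-based matching) to map unseen domain phrases onto existing concepts rather than failing silently.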
      </sec>
      <sec id="sec-3-3">
        <title>3.3. NLP Model Interpretability</title>
        <p>
          In this section, our focus shifts to interpretability, emphasizing two key aspects: ‘faithfulness’ (i.e., does
the explanation provided by the NLP model accurately represent its behavior?) and ‘transparency’
(i.e., is the explanation valuable for requirement analysts utilizing NLP models for decision-making?).
Often, NFRs intertwine with functional requirements, leading to subsequent changes at the
implementation level [26
          <xref ref-type="bibr" rid="ref6 ref7">, 27, 28</xref>
          ]. These scenarios hinder developers from holistically understanding how NFRs
interact with and potentially impact functional aspects of the code [29]. This situation occasionally becomes
unavoidable, particularly when NFR modifications occur at the code level. Conversely, the adherence
to good development practices for synchronizing or updating requirements at the document level is
not consistently enforced by developers [30]. This naturally prompts essential questions: ‘how can
NLP methods effectively bridge this gap, especially in scenarios where intricate relationships exist between
NFRs and FRs at the code level and their documentation at the requirement level is lacking?’ Furthermore,
‘how can we employ NLP models to facilitate iterative refinement and clarification of NFRs when they are
intertwined with functional requirements at the code level?’ To illustrate this with a simple example,
consider a scenario where security can be further refined into confidentiality and integrity. These
subsets should undergo further refinement until specific design methods can be applied to satisfy and
comprehend the rationale behind requirements. The challenge lies in the opacity of the NLP model’s
decision-making process. Mere output predictions are insufficient in revealing the explanations behind
these decisions or the causal factors influencing these predictions. We assert that, for diverse NFR tasks,
the interpretability of NLP models is necessary.
• NFR Scenario #8 [
          <xref ref-type="bibr" rid="ref10">31</xref>
          ]: A software development team is building a virtual assistant system with
various voice functionalities, including setting reminders, sending messages, and controlling smart
home devices. One of the non-functional requirements is ‘NFR-8: The virtual assistant should respond
to user commands accurately and with high speed, ensuring real-time interactions.’ In this scenario, the
NFR-8 constraint of real-time interactions is closely interwoven with functional requirements for different tasks,
such as setting reminders or controlling smart home devices. The challenge arises when NLP models
are used to interpret user commands and generate responses for different tasks while considering the
real-time interaction constraint. NLP models, especially those based on deep learning or complex neural
networks, can be inherently opaque and lack interpretability. When NFRs such as real-time interactions
are tightly integrated with functional requirements, it becomes challenging to discern how the NLP
model processes and prioritizes tasks based on user inputs. For instance, if a user says, ‘set a reminder
for my meeting at 3 PM,’ the NLP model needs to accurately interpret the time constraint and prioritize
the task for a real-time interaction. However, due to the complexity of the NLP model, the developers
might not have clear insights into how the model processes the time constraint and makes decisions
about real-time interactions. This lack of transparency can lead to delays in setting the reminder or
even missing the real-time interaction requirement.
• NFR Scenario #9 [
          <xref ref-type="bibr" rid="ref11">32</xref>
          ]: A software development team is developing a chatbot to assist customers with
various financial banking tasks, including account inquiries, fund transfers, and investment advice. One
of the non-functional requirements is ‘NFR-9: The chatbot should ensure data privacy and compliance
with financial regulations while providing seamless user interactions.’ In this scenario, the NFR-9 constraint of data
privacy and regulatory compliance is interwoven with functional requirements. The challenge arises
when NLP models are used to interpret customer queries and generate responses while considering
the data privacy and compliance constraints. For example, if a customer asks the chatbot to transfer
funds from one account to another, the NLP model has to ensure that the transaction is executed
securely and in compliance with financial regulations. Existing NLP models, even traditional ones
such as the Random Forest model, lack transparency in their decision-making process. Consequently, for
requirement analysts, comprehending how these NLP models handle sensitive information and ensure
compliance with financial regulations becomes exceedingly complex. Lack of interpretability in model
decisions can lead to data privacy breaches or regulatory violations, directly contradicting NFR-9.
• NFR Scenario #10 [
          <xref ref-type="bibr" rid="ref12">33</xref>
          ]: An autonomous medical diagnosis system is designed to analyze various
medical data (patient symptoms, lab test results, and medical history) to provide accurate diagnoses.
One of the non-functional requirements is ‘NFR-10: The system must ensure high interpretability of its
predictions to build trust among healthcare professionals and patients.’ Suppose a patient presents
with a combination of symptoms (e.g., fever, cough, and fatigue). The NLP system predicts a possible
respiratory infection, but the doctor, with years of experience, thinks the symptoms could also be
indicative of a more severe condition. However, the current NLP-based models used for diagnosis lack
a human-in-the-loop component. Without a human-in-the-loop component, the doctor cannot drill
into the patient’s medical history or conduct additional tests to validate the model’s predictions.
        </p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Research Directions: Roadmap</title>
      <p>The three forefront challenges that we discussed in Section 3.1, Section 3.2, and Section 3.3 have been
shown to have implications when examined from the NFR context. Thus, we provide several research
directions and recommendations complemented by alternative strategies and solutions that can be
adopted in a flexible manner.</p>
      <sec id="sec-4-1">
        <title>4.1. Directions on NLP Semantic Soundness</title>
        <p>
          The recommendations and alternative solutions that we consider for leveraging NLP models within the
NFR domain when they are constrained by the challenges tied to semantic similarity are as follows.
Recommended Approach #1: Directly integrate contextual information into NLP models to enhance
their semantic interpretability. This endeavor involves incorporating advanced contextual embeddings
[
          <xref ref-type="bibr" rid="ref13">34</xref>
          ] such as BERT [
          <xref ref-type="bibr" rid="ref14">35</xref>
          ] and RoBERTa [
          <xref ref-type="bibr" rid="ref15">36</xref>
          ] into the NLP models. These embeddings are able to capture
the contextual meanings of words and phrases. Subsequently, we can fine-tune the NLP models using
domain-specific NFR datasets. This step is essential to adapt to the specific contextual complexities
of the NFR domain. The subsequent step involves the adaptation of a domain-specific ontology that
encodes the relationships between terms and concepts within the NFR domain. This ontology is
instrumental in helping the NLP models comprehend the contextual connections among terms. To
integrate this ontology, we can employ two techniques. The first is the GNTM [37], which creates
semantic correlation graphs to capture the co-occurrence patterns of terms within NFR documents.
GNTM provides underlying semantic relationships among terms in diverse contexts. The second
technique is the DWGTM [38], which extracts topics from semantic correlation graphs. Unlike GNTM,
DWGTM explores how the semantics of NFR terms evolve across various periods or contexts. By
combining GNTM and DWGTM, we can create semantic interpretability layers within the NFR context.
Recommended Approach #2: Develop domain-specific NLP models tailored explicitly to NFRs.
These models would be designed with a specific awareness of the intricate linguistic and semantic
structures inherent in NFR-related text. The specific steps that we recommend are the following. Initially,
we should collect a diverse and extensive NFR dataset representative of the specific domain(s) of interest.
This dataset should entail a range of NFR types. Then, we should consider annotating the dataset to
provide clear labels indicating the semantic relationships between different NFRs, capturing various
levels of similarity and dissimilarity. Once that is done, we could develop NLP models that take into
account the inherent complexities of NFRs. This could involve using advanced architectures such as
transformers [34], recurrent neural networks [39], or hybrid models that incorporate both word- and
phrase-level embeddings [40]. Afterwards, we could train the NLP models on the domain-specific NFR
dataset, fine-tuning them to understand the specific linguistic context and semantic variations present
in NFRs. Finally, using a separate validation dataset, we can evaluate the domain-specific NLP models’
ability to capture NFR semantic similarity and compare their performance against traditional NLP
models to demonstrate the improvements.
        </p>
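The final evaluation step above can be sketched as a small harness. The annotated validation pairs, the decision threshold, and the word-overlap scorer standing in for a traditional model are all hypothetical placeholders; a fine-tuned domain-specific model would be passed to the same harness for comparison.

```python
def evaluate_similarity_model(score_fn, annotated_pairs, threshold=0.5):
    """Fraction of annotated NFR pairs whose predicted relatedness
    (score at or above the threshold) matches the analyst's label."""
    correct = 0
    for text_a, text_b, related in annotated_pairs:
        predicted = score_fn(text_a, text_b) >= threshold
        correct += (predicted == related)
    return correct / len(annotated_pairs)

# Hypothetical annotated validation pairs: (requirement A, requirement B, related?)
validation = [
    ("The system shall encrypt stored card data.",
     "All payment records must be protected at rest.", True),
    ("The UI shall load within two seconds.",
     "Users must reset passwords every 90 days.", False),
]

# A trivial word-overlap (Jaccard) scorer standing in for a traditional model.
def overlap_score(a, b):
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb)

print(f"baseline accuracy: {evaluate_similarity_model(overlap_score, validation):.2f}")
```

The overlap baseline misses the first pair entirely (the two requirements share no surface words despite being semantically related), which is exactly the kind of failure a domain-specific model should be shown to correct on the same validation set.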
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Directions on NLP Ontology Generalizability for NFR</title>
        <p>
          Human feedback and expertise are crucial, especially for understanding the relationships and rationales
among NFRs, as well as their taxonomy relations with other NFRs. The same principle needs to
be applied to predictive NLP models, which are designed, viewed, and treated as black-box models.
However, existing NLP models have not been designed to incorporate a ‘humans-in-the-loop’ approach.
To address the interpretability problem in the context of NFRs, we require models that can interpret
output predictions and involve humans-in-the-loop for final decision-making. Frequently, we apply NLP
to requirements—such a model would make probabilistic inferences as to why a specific requirement is
detected as NFR-related or not. While these models often perform well, it is essential to understand
how NLP models process and reach such conclusions. If we closely examine Fig1u, rite illustrates a
scenario where an NLP model predicts that the requirement document is related to ’Reliability’ based
on correlated features selected by the NLP model in the document, such as ‘structuredness, conciseness,
error/fault, MTBF, consistency, accuracy.’ However, after the requirement analyst carefully reviews
the document and its NFR-related content, they may disagree with the prediction, considering it as an
‘understandability’ document. This disagreement arises because most features, such as ‘conciseness,
structuredness, legibility, and consistency,’ are subcategories of ‘understandability,’ despite the fact
that ‘consistency’ is a sub-category that belongs to both ‘understandability’ and ‘reliability.’ Without
explanations of how the NLP model arrived at these predictions, which cannot be solely revealed by
the output predictions, insights are lacking. Based on these observations, we recommend the following.
Recommended Approach #1: Leverage word embeddings and embedding-based techniques to
enhance the ontology. These techniques can enhance the ontology with richer semantic information
and provide contextual meaning of NFR terms. Specifically, Word2V4e1c],[GloVe [
          <xref ref-type="bibr" rid="ref21">42</xref>
          ], and/or FastText
[
          <xref ref-type="bibr" rid="ref22">43</xref>
          ] can be employed to train word embeddings that encapsulate contextual meanings. Subsequently,
these trained word embedding models can be applied to execute Named Entity Recognition (NER) [44]
on the NFR-related dataset. NER has the capability to detect and extract NFR-related entities such as
‘security,’ ‘performance,’ ‘usability,’ and others from the NFR-related dataset. Following this, text mining
can be utilized as a complementary technique to extract NFR-related terms both from the NFR dataset
and the output of the NER process. This procedure involves identifying concepts and phrases specific
to NFRs, which should then be incorporated into the NFR ontology. For example, the approach could
identify the dependency relationship between ‘security’ and ‘data encryption’ and add them into the
NFR ontology. This, in turn, facilitates the reuse of foundational knowledge.
        </p>
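The last step, folding extracted concept pairs such as ‘security’ and ‘data encryption’ into the NFR ontology, might look like the following minimal sketch. The upstream extraction pipeline (embeddings, NER, text mining) is abstracted into a hard-coded list of hypothetical pairs.

```python
from collections import defaultdict

class NFROntology:
    """Minimal concept graph: each concept maps to its related terms."""
    def __init__(self):
        self.relations = defaultdict(set)

    def add_relation(self, concept, term):
        # Store the dependency symmetrically so either side is reusable.
        self.relations[concept].add(term)
        self.relations[term].add(concept)

    def related(self, concept):
        return sorted(self.relations[concept])

# Hypothetical pairs produced upstream by NER and text mining over NFR documents.
extracted_pairs = [
    ("security", "data encryption"),
    ("security", "access control"),
    ("performance", "response time"),
]

ontology = NFROntology()
for concept, term in extracted_pairs:
    ontology.add_relation(concept, term)

print(ontology.related("security"))  # ['access control', 'data encryption']
```

Storing the relation symmetrically is what makes the foundational knowledge reusable: a query from either end of the dependency ('security' or 'data encryption') recovers the link.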
        <p>Recommended Approach #2: Use compositional embeddings (CE) to improve NLP ontology
generalizability. CE creates phrase/sentence embeddings from individual words, capturing semantic
relationships within text. A hierarchical phrase embedding model can be trained on NFR data to
compose NFR-specific phrase embeddings, generalizing across domains even with limited NFR ontology
coverage. For example, it could recognize the similarity between “fluctuating transportation demands”
and “peak load.” Contextual adaptation can further update the ontology with new NFR knowledge. This
approach addresses the limitations of standard ontologies like WordNet, which often lack NFR-specific
terms (e.g., “offline mode”). However, CE effectiveness depends on the quality and quantity of NFR
training data and the complexity of NFR expressions.</p>
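A minimal sketch of the CE idea, using tiny hand-invented 2-d word vectors rather than trained embeddings (all values below are illustrative): phrase embeddings are composed as the mean of their word vectors, which lets ‘fluctuating transportation demands’ land near ‘peak load’ even though the phrases share no words.

```python
import math

# Illustrative 2-d word vectors, invented for this sketch (not trained).
WORD_VECS = {
    "fluctuating":    [0.7, 0.3],
    "transportation": [0.5, 0.5],
    "demands":        [0.9, 0.2],
    "peak":           [0.8, 0.3],
    "load":           [0.9, 0.3],
    "offline":        [-0.2, 0.9],  # an unrelated direction, for contrast
}

def phrase_embedding(phrase):
    # Compose a phrase vector as the element-wise mean of its word vectors.
    vecs = [WORD_VECS[w] for w in phrase.split()]
    return [sum(dim) / len(vecs) for dim in zip(*vecs)]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

a = phrase_embedding("fluctuating transportation demands")
b = phrase_embedding("peak load")
print(f"similarity: {cosine(a, b):.2f}")  # high, despite zero word overlap
```

In a real setting the word vectors would come from embeddings trained on NFR text, and a hierarchical composition model would replace the plain mean; the generalization mechanism, however, is the same.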
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Directions on NLP Interpretability and Human Feedback</title>
        <p>Recommended Approach #1: Use post-hoc local explanations, such as the ‘Local Interpretable
Model-Agnostic Explanations’ (LIME) [45]. LIME trains simple surrogate models on perturbed variants of a
specific instance, obtains the black-box model’s decisions for those variants, and weighs them by their
proximity to the instance being explained. This is achieved by initially perturbing the inputs and subsequently
monitoring the outputs derived from this ‘black box’ to understand how predictive outcomes change.
Incorporate global explanations by providing explanations in the context of a system’s general behavior
through independent ‘if-then rules’ [46]. With this approach, the model learns essential rules that
elucidate the classification of particular instances, with these rules propagating from instance-level
to class-level rules. These rules are then processed to obtain the best set. In the realm of global
explanations, the primary interest is understanding how the model arrives at selecting the best set
and, more crucially, how it generates the final rule set for specific instances in the dataset. Another
approach to consider is incorporating Sequence-to-Sequence models for explaining NLP models [47].
The Sequence-to-Sequence approach employs a variational auto-encoder to produce meaningful input
perturbations. Analyzing input variations through perturbing inputs is becoming a reliable method for
generating explainable predictive models. Furthermore, there have been advances in enhancing the
meaningfulness of word embeddings using word intrusion [6]. This work builds on the prior research
of Chang et al. [48], which interprets probabilistic topic models. A common method to interpret
embeddings is to enforce sparsity during training [49].</p>
        <p>Recommended Approach #2: Build an interactive explanation. This involves implementing an
interactive explanation mechanism that offers comprehensive insights into the decision-making process
of the NLP model. Users should be able to pose questions and seek explanations for the model’s
predictions. Throughout this process, incorporate uncertainty estimation techniques to quantify the
level of confidence in the model’s predictions. For instance, techniques like SHAP (SHapley Additive
exPlanations) [50, 51] can be employed for such estimations. SHAP is capable of providing explanations
for individual predictions, allowing users to comprehend specific model outcomes. The subsequent
step would entail establishing a feedback loop involving domain experts to review the model’s outputs
and assess its interpretability. Their feedback can shed light on any deficiencies in the model. By
adopting these approaches, NLP models can be effectively integrated into the NFR domain. Both of
these recommendations aim to augment user comprehension, transparency, and trust in the model’s
predictions, thereby enhancing the user-friendliness and reliability of NLP applications.</p>
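        <p>The perturbation idea behind LIME can be sketched in a leave-one-out form; note that LIME itself fits a weighted local surrogate over many random perturbations, and the keyword scorer below is a hypothetical stand-in for a real black-box classifier:</p>

```python
# Hypothetical sketch of LIME-style perturbation: the "black box" below is a
# stand-in keyword scorer, not a real model, and the cue weights are invented.
def black_box_score(words):
    """Toy confidence that a requirement is security-related."""
    cues = {"encrypt": 0.5, "password": 0.3, "login": 0.2}
    return sum(w for cue, w in cues.items() if any(cue in t for t in words))

def explain(sentence):
    """Attribute the prediction to words by leave-one-out perturbation."""
    words = sentence.lower().split()
    base = black_box_score(words)
    contributions = {}
    for i, word in enumerate(words):
        perturbed = words[:i] + words[i + 1:]
        contributions[word] = base - black_box_score(perturbed)
    return contributions

expl = explain("Users must login with an encrypted password")
```

        <p>Words whose removal shifts the prediction the most surface as the explanation, which is the kind of outcome an interactive explanation mechanism would expose to users.</p>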
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>This paper provides insights into the research gaps at the intersection of the NLP and NFR domains,
focusing on three key NLP challenges: semantic soundness, ontology generalizability, and
interpretability. By raising awareness of these challenges, researchers can exercise greater caution
when leveraging NLP models as primary solutions within the NFR domain, especially in safety-critical
contexts. We illustrate these challenges with ten typical scenarios. Finally, we propose a research
agenda outlining recommendations and strategies to advance research within requirements engineering
and potentially the broader software engineering community.</p>
      <p>[4] C. Rosset, Turing-NLG: A 17-billion-parameter language model by Microsoft,
https://www.microsoft.com/en-us/research/blog/turing-nlg-a-17-billion-parameter-language-model-by-microsoft/,
2020. [Online; accessed 03/16, 2023].
[5] A. Kumar, P. Howlader, R. Garcia, D. Weiskopf, K. Mueller, Challenges in Interpretability of Neural
Networks for Eye Movement Data, in: Symposium on Eye Tracking and Applications, 2020, pp.
1–5.
[6] L. K. Şenel, I. Utlu, V. Yücesoy, A. Koc, T. Cukur, Semantic Structure and Interpretability of Word
Embeddings, Transactions on Audio, Speech, &amp; Language Processing 26 (2018) 1769–1779.
[7] S. Choudhary, N. Chatterjee, S. K. Saha, Interpretation of Black Box NLP Models: A Survey,
arXiv:2203.17081 (2022).
[8] M. T. Ribeiro, S. Singh, C. Guestrin, ”Why Should I Trust You?” Explaining the Predictions of Any
Classifier, in: 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data
Mining, 2016, pp. 1135–1144.
[9] K. S. Clark, Efficient and Scalable Transfer Learning for Natural Language Processing, Stanford
University, 2021.
[10] W. Zhao, H. Peng, S. Eger, E. Cambria, M. Yang, Towards Scalable and Reliable Capsule Networks
for Challenging NLP Applications, arXiv:1906.02829 (2019).
[11] K.-W. Chang, H. He, R. Jia, S. Singh, Robustness and Adversarial Examples in Natural Language
Processing, in: Proceedings of the 2021 Conference on Empirical Methods in Natural Language
Processing, 2021, pp. 22–26.
[12] I. Mollas, N. Bassiliades, I. Vlahavas, G. Tsoumakas, LionForests: Local Interpretation of Random
Forests, arXiv preprint arXiv:1911.08780 (2019).
[13] D. Dermeval, J. Vilela, I. I. Bittencourt, J. Castro, S. Isotani, P. Brito, A. Silva, Applications of
Ontologies in Requirements Engineering: A Systematic Review of the Literature, Requirements
Engineering 21 (2016) 405–437.
[14] M. E. Peters, M. Neumann, M. Iyyer, M. Gardner, C. Clark, K. Lee, L. Zettlemoyer, Deep
Contextualized Word Representations, arXiv:1802.05365 (2018).
[15] F. Giunchiglia, M. Fumagalli, Entity Type Recognition–Dealing with the Diversity of Knowledge,
in: International Conference on Principles of Knowledge Representation and Reasoning, volume 17,
2020, pp. 414–423.
[16] Z. C. Lipton, The Mythos of Model Interpretability: In Machine Learning, the Concept of
Interpretability is both Important and Slippery, Queue 16 (2018) 31–57.
[17] M. F. Lohmann, Liability Issues Concerning Self-Driving Vehicles, European Journal of Risk
Regulation 7 (2016) 335–340.
[18] A. Stockel, Securing data and financial transactions, in: Proceedings The Institute of Electrical
and Electronics Engineers. 29th Annual 1995 International Carnahan Conference on Security
Technology, IEEE, 1995, pp. 397–401.
[19] A. Alessi, G. Ciccarelli, L. Cipolli, L. Guidotti, A. Marsano, A. Hanganu, Privacy by design and by
default in software development in order to prevent unlawful processing of personal data. privacy
certifications impact on software development and liabilities, 2021.
[20] J. A. Swets, Measuring the accuracy of diagnostic systems, Science 240 (1988) 1285–1293.
[21] J. Clarke-Salt, SQL injection attacks and defense, Elsevier, 2009.
[22] R. M. Thiyab, M. Ali, F. Basil, et al., The Impact of SQL Injection Attacks on the Security of
Databases, in: 6th International Conference of Computing and Informatics, School of Computing,
2017, pp. 323–331.
[23] M. Seufert, S. Egger, M. Slanina, T. Zinner, T. Hoßfeld, P. Tran-Gia, A survey on quality of
experience of http adaptive streaming, IEEE Communications Surveys &amp; Tutorials 17 (2014)
469–492.
[24] J. R. Ball, B. T. Miller, E. P. Balogh, Improving diagnosis in health care, National Academies Press,
2015.
[47] D. Alvarez-Melis, T. S. Jaakkola, A Causal Framework for Explaining the Predictions of Black-Box
Sequence-to-Sequence Models, arXiv (2017).
[48] J. Chang, S. Gerrish, C. Wang, J. Boyd-Graber, D. Blei, Reading Tea Leaves: How Humans Interpret
Topic Models, Advances in Neural Information Processing Systems 22 (2009).
[49] V. Trifonov, O.-E. Ganea, A. Potapenko, T. Hofmann, Learning and Evaluating Sparse Interpretable
Sentence Embeddings, arXiv:1809.08621 (2018).
[50] Y. Nohara, K. Matsumoto, H. Soejima, N. Nakashima, Explanation of machine learning models
using improved shapley additive explanation, in: Proceedings of the 10th ACM International
Conference on Bioinformatics, Computational Biology and Health Informatics, 2019, pp. 546–546.
[51] I. Ekanayake, D. Meddage, U. Rathnayake, A novel approach to explain the black-box nature
of machine learning in compressive strength predictions of concrete using shapley additive
explanations (shap), Case Studies in Construction Materials 16 (2022) e01059.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>C. J.</given-names>
            <surname>Neill</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. A.</given-names>
            <surname>Laplante</surname>
          </string-name>
          , Requirements Engineering:
          <article-title>The State of the Practice</article-title>
          ,
          <source>IEEE Software 20</source>
          (
          <year>2003</year>
          )
          <fpage>40</fpage>
          -
          <lpage>45</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>M.</given-names>
            <surname>Kassab</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Neill</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Laplante</surname>
          </string-name>
          ,
          <article-title>State of Practice in Requirements Engineering: contemporary data</article-title>
          ,
          <source>Innovations in Systems and Software Engineering</source>
          <volume>10</volume>
          (
          <year>2014</year>
          )
          <fpage>235</fpage>
          -
          <lpage>241</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>S. F.</given-names>
            <surname>Tjong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. M.</given-names>
            <surname>Berry</surname>
          </string-name>
          ,
          <article-title>The Design of SREE-A Prototype Potential Ambiguity Finder for Requirements Specifications and Lessons Learned</article-title>
          , in: International Conference on Requirement Engineering for Software Quality, Springer,
          <year>2013</year>
          , pp. 80–95.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>M.</given-names>
            <surname>Bulger</surname>
          </string-name>
          ,
          <article-title>Personalized learning: The conversations we're not having</article-title>
          ,
          <source>Data and Society</source>
          <volume>22</volume>
          (
          <year>2016</year>
          )
          <fpage>1</fpage>
          -
          <lpage>29</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>R.</given-names>
            <surname>Krasniqi</surname>
          </string-name>
          ,
          <article-title>Detecting scattered and tangled quality concerns in source code to aid maintenance and evolution tasks</article-title>
          , in: 2023 IEEE/ACM 45th International Conference on Software Engineering: Companion
          Proceedings (ICSE-Companion)
          , IEEE,
          <year>2023</year>
          , pp.
          <fpage>184</fpage>
          -
          <lpage>188</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>R.</given-names>
            <surname>Krasniqi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Do</surname>
          </string-name>
          ,
          <article-title>Towards semantically enhanced detection of emerging quality-related concerns in source code</article-title>
          ,
          <source>Software Quality Journal</source>
          <volume>31</volume>
          (
          <year>2023</year>
          )
          <fpage>865</fpage>
          -
          <lpage>915</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>R.</given-names>
            <surname>Krasniqi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Do</surname>
          </string-name>
          ,
          <article-title>Capturing contextual relationships of buggy classes for detecting quality-related bugs</article-title>
          ,
          <source>in: 2023 IEEE International Conference on Software Maintenance and Evolution (ICSME)</source>
          , IEEE,
          <year>2023</year>
          , pp.
          <fpage>375</fpage>
          -
          <lpage>379</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>J.</given-names>
            <surname>Eckhardt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Vogelsang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. M.</given-names>
            <surname>Fernández</surname>
          </string-name>
          ,
          <article-title>Are “Non-functional” Requirements really Nonfunctional? An Investigation of Non-functional Requirements in Practice</article-title>
          , in: International Conference on Software Engineering,
          <year>2016</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>11</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>I. J.</given-names>
            <surname>Jureta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Faulkner</surname>
          </string-name>
          , P.-Y. Schobbens,
          <article-title>A More Expressive Softgoal Conceptualization for Quality Requirements Analysis</article-title>
          ,
          <source>in: International Conference on Conceptual Modeling</source>
          , Springer,
          <year>2006</year>
          , pp.
          <fpage>281</fpage>
          -
          <lpage>295</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>M. B.</given-names>
            <surname>Hoy</surname>
          </string-name>
          ,
          <article-title>Alexa, Siri, Cortana, and more: an introduction to voice assistants</article-title>
          ,
          <source>Medical reference services quarterly 37</source>
          (
          <year>2018</year>
          )
          <fpage>81</fpage>
          -
          <lpage>88</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>L.</given-names>
            <surname>Anaya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Braizat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Al-Ani</surname>
          </string-name>
          ,
          <article-title>Implementing ai-based chatbot: Benefits and challenges</article-title>
          ,
          <source>Procedia Computer Science</source>
          <volume>239</volume>
          (
          <year>2024</year>
          )
          <fpage>1173</fpage>
          -
          <lpage>1179</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [33]
          <string-name>
            <given-names>M.</given-names>
            <surname>Sarwar Kamal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Dey</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. S.</given-names>
            <surname>Ashour</surname>
          </string-name>
          ,
          <article-title>Large scale medical data mining for accurate diagnosis: A blueprint, in: Handbook of large-scale distributed Computing in smart healthcare</article-title>
          , Springer,
          <year>2017</year>
          , pp.
          <fpage>157</fpage>
          -
          <lpage>176</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [34]
          <string-name>
            <given-names>A.</given-names>
            <surname>Chernyavskiy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ilvovsky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Nakov</surname>
          </string-name>
          ,
          <article-title>Transformers: “the end of history” for natural language processing?</article-title>
          ,
          <source>in: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD</source>
          <year>2021</year>
          , Bilbao, Spain,
          <source>September 13-17</source>
          ,
          <year>2021</year>
          , Proceedings,
          <source>Part III 21</source>
          , Springer,
          <year>2021</year>
          , pp.
          <fpage>677</fpage>
          -
          <lpage>693</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [35]
          <string-name>
            <given-names>J.</given-names>
            <surname>Devlin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.-W.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Toutanova</surname>
          </string-name>
          ,
          <article-title>BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding</article-title>
          , arXiv (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [36]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ott</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Goyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Du</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Joshi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Levy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lewis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Zettlemoyer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Stoyanov</surname>
          </string-name>
          ,
          <article-title>RoBERTa: A Robustly Optimized BERT Pretraining Approach</article-title>
          , ArXiv (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [37]
          <string-name>
            <given-names>D.</given-names>
            <surname>Shen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Qin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Dong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Xiong</surname>
          </string-name>
          ,
          <article-title>Topic Modeling Revisited: A Document Graph-based Neural Network Perspective</article-title>
          ,
          <source>in: Neural IPS</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>13</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [38]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Ouyang</surname>
          </string-name>
          ,
          <article-title>Extracting Topics with Simultaneous Word Co-occurrence and Semantic Correlation Graphs: Neural Topic Modeling for Short Texts</article-title>
          , in: Association for Computational Linguistics,
          <year>2021</year>
          , pp.
          <fpage>18</fpage>
          -
          <lpage>27</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [39]
          <string-name>
            <given-names>H.</given-names>
            <surname>Salehinejad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Sankar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Barfett</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Colak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Valaee</surname>
          </string-name>
          ,
          <article-title>Recent advances in recurrent neural networks</article-title>
          , arXiv preprint arXiv:1801.01078 (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [40]
          <string-name>
            <given-names>M.</given-names>
            <surname>Tang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Zhuang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Guo</surname>
          </string-name>
          ,
          <article-title>Phrase-level global-local hybrid model for sentence embedding</article-title>
          ,
          <source>in: Int'l Conference on Multimedia and Expo</source>
          , IEEE,
          <year>2020</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>6</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [41]
          <string-name>
            <given-names>K. W.</given-names>
            <surname>Church</surname>
          </string-name>
          ,
          <article-title>Word2Vec</article-title>
          ,
          <source>Natural Language Engineering</source>
          <volume>23</volume>
          (
          <year>2017</year>
          )
          <fpage>155</fpage>
          -
          <lpage>162</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [42]
          <string-name>
            <given-names>J.</given-names>
            <surname>Pennington</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Socher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. D.</given-names>
            <surname>Manning</surname>
          </string-name>
          ,
          <article-title>GloVe: Global Vectors for Word Representation</article-title>
          , in: EMNLP,
          <year>2014</year>
          , pp.
          <fpage>1532</fpage>
          -
          <lpage>1543</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [43]
          <string-name>
            <given-names>B.</given-names>
            <surname>Athiwaratkun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. G.</given-names>
            <surname>Wilson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Anandkumar</surname>
          </string-name>
          ,
          <article-title>Probabilistic Fasttext for Multi-Sense Word Embeddings</article-title>
          , arXiv preprint arXiv:1806.02901 (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [44]
          <string-name>
            <given-names>X.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Xia</surname>
          </string-name>
          ,
          <article-title>Overview of Named Entity Recognition</article-title>
          ,
          <source>Journal of Contemporary Education</source>
          <volume>6</volume>
          (
          <year>2022</year>
          )
          <fpage>65</fpage>
          -
          <lpage>68</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [45]
          <string-name>
            <given-names>S.</given-names>
            <surname>Mishra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. L.</given-names>
            <surname>Sturm</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Dixon</surname>
          </string-name>
          ,
          <article-title>Local Interpretable Model-Agnostic Explanations for Music Content Analysis</article-title>
          ,
          <source>in: ISMIR</source>
          , volume
          <volume>53</volume>
          ,
          <year>2017</year>
          , pp.
          <fpage>537</fpage>
          -
          <lpage>543</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [46]
          <string-name>
            <given-names>N.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Hu</surname>
          </string-name>
          ,
          <article-title>On Interpretation of Network Embedding via Taxonomy Induction</article-title>
          ,
          <source>in: International Conference on Knowledge Discovery and Data Mining</source>
          ,
          <year>2018</year>
          , pp.
          <fpage>1812</fpage>
          -
          <lpage>1820</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>