Knowledge Representation of Requirements Documents Using Natural Language Processing

Aaron Schlutter (Technische Universität Berlin, DCAITI, Germany; aaron.schlutter@tu-berlin.de)
Andreas Vogelsang (Technische Universität Berlin, DCAITI, Germany; andreas.vogelsang@tu-berlin.de)

Abstract

Complex systems such as automotive software systems are usually broken down into subsystems that are specified and developed in isolation and afterwards integrated to provide the functionality of the desired system. This results in a large number of requirements documents for each subsystem, written by different people and in different departments. Requirements engineers are challenged to comprehend the concepts mentioned in a requirement because coherent information is spread over several requirements documents. In this paper, we describe a natural language processing pipeline that we developed to transform a set of heterogeneous natural language requirements into a knowledge representation graph. The graph provides an orthogonal view onto the concepts and relations written in the requirements. We provide a first validation of the approach by applying it to two requirements documents comprising more than 7,000 requirements from industrial systems. We conclude the paper by stating open challenges and potential applications of the knowledge representation graph.

1 Introduction

In practice, complex systems are often developed in terms of several loosely coupled subsystems. Requirements specifications often exist, if at all, only on the level of subsystems. Mostly written in natural language, the requirements documents contain the specific knowledge necessary to develop the single subsystem. With the advent of more complex features that require several subsystems to interact, it is getting harder for requirements engineers to comprehend the relevant knowledge that is spread over several requirements documents. Automotive software systems are a good example of this phenomenon.
The different subsystems within a car are usually developed by completely separate teams that employ their own requirements engineering process and documentation, with a focus on their own subsystem and limited knowledge about the relations to other subsystems. In an earlier study, we have shown that developers of single subsystems of an automotive system are not aware of more than half of the dependencies that their subsystem has to other subsystems [VF13]. In this paper, we describe a natural language processing pipeline that transforms a set of natural language statements (e.g., requirements) into a knowledge representation graph. The purpose of this knowledge representation graph is to summarize and structure all concepts and relations contained in the requirements over all subsystem specifications. The graph represents this knowledge by a set of triples inspired by RDF (Resource Description Framework)¹. To create the graph, our NLP pipeline first performs a number of preprocessing steps on the requirements to normalize the data and then uses Semantic Role Labeling [CM04] to extract single triples. The triples are afterwards merged into a single knowledge graph. As an early validation, we applied the pipeline to two requirements documents, one academic example with 150 requirements and one industrial document with over 7,000 requirements. Both documents contain requirements from the automotive domain. The resulting knowledge graph for the smaller document contains 345 nodes, while the graph for the larger document consists of over 13,000 nodes. These results show how much conceptual information is contained in requirements documents and motivate why knowledge representation may help in analyzing the semantics embedded in the requirements documents.

Copyright © 2018 by the paper's authors. Copying permitted for private and academic purposes.
Although we have not yet elaborated on specific applications of this knowledge graph, we envision benefits for assessing the amount of implicit knowledge in requirements documents, for trace link generation, and for supporting reviews of requirements documents. At the end of this paper, we detail some of the challenges that need to be solved to leverage the full potential of this knowledge representation, including the identification of semantic similarity and the representation of causal relationships.

2 Background

Knowledge representation is useful in many disciplines, whenever collecting, managing, and sharing information and facts is necessary. The form of knowledge representation may vary from expressing knowledge in natural language to technical solutions that aim at creating complex logical views on the information. As a technical term, the topic evolved in the context of artificial intelligence and reasoning systems in the 1950s. Knowledge representation focuses on the depiction of information in a way that enables computers to solve complex problems, and it supports users and computers in handling large amounts of information. The basic idea of knowledge representation systems is to store complex information or facts in a knowledge base and make them available for various applications. Therefore, searching the knowledge base is a crucial task. How information can be queried depends on the way the information is stored. Reasoning systems, for example, aim at extracting information to deduce new facts by means of an inference engine. Deduced information can be added to the knowledge base. These systems and their knowledge representations are often very formal and limited to solving specific problems. One possibility to represent knowledge is provided by RDF. In RDF, each statement/fact is represented by a triple that contains a subject, an object, and a predicate describing the relationship between them.
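A minimal sketch of this triple-based representation, using plain Python tuples instead of a full RDF library (the phrases are invented for illustration; the real pipeline extracts them from requirements sentences):

```python
# Each fact is a (subject, predicate, object) triple, inspired by RDF.
# The concept phrases below are made up for illustration.
triples = [
    ("headlight", "is part of", "exterior light system"),
    ("exterior light system", "is part of", "vehicle"),
    ("driver", "activates", "headlight"),
]

# One triple's object can be another triple's subject, which chains the
# facts into a graph: subjects/objects become nodes, predicates edges.
nodes = {t[0] for t in triples} | {t[2] for t in triples}

def parts_of(target):
    """All concepts that are (transitively) part of `target`."""
    found = {s for s, p, o in triples if p == "is part of" and o == target}
    for part in list(found):
        found |= parts_of(part)
    return found

print(sorted(nodes))
print(sorted(parts_of("vehicle")))  # → ['exterior light system', 'headlight']
```

The transitive query already hints at why a graph view is useful: the fact that the headlight belongs to the vehicle is never stated explicitly in any single triple.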
The parts of an RDF triple are also categorized into different classes, e.g., a subject can be an instance of the class resource or of the class literal. One triple's object can be another triple's subject, which makes it possible to concatenate multiple triples. A set of triples can be represented as an RDF graph, where subjects and objects are represented as nodes and predicates as edges. In addition to RDF, it is possible to define an RDF schema (RDFS) and build up an ontology. Such a schema contains specifications about the classes used in an RDF model. For this purpose, subclasses are defined, e.g., of resource or literal, specifying a limited value range or defining permissible relationships to other classes.

Knowledge bases can be built manually or extracted automatically from other knowledge representations. If the source for such an extraction is natural language, NLP pipelines are usually used for information extraction. In the case of building up an ontology, this process is called ontology learning and is often semi-automatic. Due to the ambiguity of natural language, these pipelines usually result in knowledge representations that are imprecise and incomplete to a certain degree. Despite these shortcomings, such structured, semi-formal knowledge bases enable computers to execute a wider range of algorithms. Moreover, for knowledge contained in industrial requirements specifications, a (semi-)automatic extraction is often the only feasible way to obtain a proper knowledge representation.

2.1 Related Work

Borgida et al. stated as early as 1985 that knowledge representation is the basis for requirements engineering [BGM85]. They argue that in addition to the functional specification of a system, it is essential to have more knowledge about its environment, i.e., "knowledge of terms, technical and scientific rules, everyday procedures and conventions".
The proposed Requirements Modeling Language (RML) can be seen as a predecessor of an ontology; they proposed to use such a model as the basis to exchange knowledge in a formal, yet easy to understand, way.

¹ https://www.w3.org/TR/2014/REC-rdf11-concepts-20140225/

OntoLearn [VFN13], a pipeline for ontology learning, consists of several steps to learn the taxonomy of an ontology. At first, they identify and extract terms inside a corpus, search for their hypernyms using additional knowledge sources like web glossaries and lexical resources (e.g., WordNet²), and build up a "noisy hypernym graph". Afterwards, this graph is filtered by given upper terms of a specific domain. Lastly, the graph is pruned to a tree-like taxonomy to reduce cycles and the number of hypernyms for most nodes. In case important edges were pruned, they are optionally recovered at the end. This workflow results in a directed acyclic graph.

Browarnik and Maimon proposed to build graphs via the concept "from clauses or subsentences to RDF triples and RDFS" [BM15]. They argued that the common approach to ontology learning, the so-called layer cake, is not useful because each step (layer) analyzes the whole corpus from a single point of view, which is often faulty, and these errors are propagated to all dependent layers. This results in generally low recall and precision (less than 0.6). To circumvent this shortcoming, they propose alternative models for ontology learning by reviewing and evaluating existing papers.

One use of ontologies in the automotive context is recommendation-based decision support [HK16]. The authors search for similar, already completed safety analyses to support a human expert in hazard analysis and risk assessment, as prescribed by the ISO 26262 standard. While the ontology contains extensive expert knowledge, spreading activation, a highly configurable semantic search algorithm, is used to query the knowledge base for relevant information.

Dermeval et al.
report on the use of ontologies in requirements engineering in their systematic literature review [DVB+16]. They reviewed 67 papers from academic and industrial application contexts dealing with different types of requirements, and identified and categorized typical problems solved by ontologies in different RE phases. Most papers deal with problems like (1) ambiguity, inconsistency, and/or incompleteness, (2) requirements management/evolution, and (3) domain knowledge representation. While only 34% of the papers reuse existing ontologies in their solutions, most of them specified their own ontology. The largest number of papers in the review rely on textual requirements as the RE modeling style, especially in the specification phase.

3 Natural Language Processing Pipeline

Figure 1 shows our pipeline to build a graph from natural language statements. The core parts of our pipeline, Stanford CoreNLP [MSB+14] and SENNA [CWB+11] for semantic role labeling, are pipelines themselves and will be described in detail in the following subsections along with the graph building.

Figure 1: NLP pipeline to transform NL requirements specifications into knowledge representation graphs (Filtering → Tokenizing/Sentence Splitting → Part-of-Speech Tagging → Lemmatizing → Semantic Role Labeling → Graph Building; tokenizing through lemmatizing are performed by Stanford CoreNLP, semantic role labeling by SENNA)

The current pipeline only supports English as natural language. Therefore, we filter out statements that contain languages other than English. Representations such as tables or images are not interpretable by the pipeline and are also ignored. The graph building at the end of the pipeline requires tokens annotated with semantic roles, lemmas, and part-of-speech tags. Lemmatizing requires tokens that are already annotated with part-of-speech tags, while semantic role labeling only requires the tokens themselves. Since SENNA as a standalone tool requires tokenized sentences as input, Stanford CoreNLP is positioned prior to it.
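The ordering constraints described above can be made explicit as a dependency graph over the pipeline steps. The following sketch, based only on the dependencies stated in the text, computes a valid execution order:

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each step maps to the set of steps whose output it requires,
# as described in the text above.
dependencies = {
    "tokenizing/sentence splitting": {"filtering"},
    "part-of-speech tagging": {"tokenizing/sentence splitting"},
    "lemmatizing": {"part-of-speech tagging"},
    "semantic role labeling": {"tokenizing/sentence splitting"},
    "graph building": {"lemmatizing", "semantic role labeling",
                       "part-of-speech tagging"},
}

# A topological sort yields one valid execution order of the steps.
order = list(TopologicalSorter(dependencies).static_order())
print(order)
```

Note that the ordering is only partially constrained: semantic role labeling could in principle run before or after part-of-speech tagging, which is why the tools can be arranged as two sub-pipelines.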
3.1 Stanford CoreNLP

Stanford CoreNLP is a collection of common NLP tasks assembled in a pipeline³. We are using parts of it, particularly the tokenization, sentence splitting, part-of-speech tagging, and lemmatizing (morphological analysis).

² https://wordnet.princeton.edu/
³ https://stanfordnlp.github.io/CoreNLP/

The first two tasks determine all tokens (words, punctuation marks, etc.) and sentences in a natural language text. Part-of-speech tagging categorizes the tokens by their grammatical role inside a sentence, e.g., as noun, verb, or article. Lastly, each word is annotated with its lemma, a base form depending on the part-of-speech tag. For example, the lemma of "walking" (verb) is "walk" (verb), and the lemma of "walkers" (noun) is "walker" (noun). In contrast to this, stemming these words would result in "walk" as the word stem.

3.2 Semantic Role Labeling

Semantic role labeling (SRL) [CM04] is a sentence-based NLP task that consists of two steps. In the first step, all predicates or verbs of a sentence are determined. In the second step, all semantic arguments of each verb are identified and classified with their roles within the sentence. In general, the first argument is called the agent and the second the patient; all other semantic roles depend on the verb. For example, "John broke the window with a rocket" contains the verb "broke". Its arguments are [A0 John], [A1 the window], and [AM-MNR with a rocket]. The numbered arguments would stay the same even if the syntax of the sentence changed, e.g., "[A1 The window] was [V broken] [AM-MNR with a rocket] [A0 by John]". The semantic roles are predefined in a database called PropBank⁴. As stated in [CM05, Table 1], most of the identified propositions have at least one argument, and about two-thirds also have a second argument. The CoNLL (Conference on Computational Natural Language Learning) shared task for semantic role labeling was first published in 2004 [CM04] and extended in 2005 [CM05].
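The stability of the numbered arguments can be illustrated with plain data structures. The frames below are hand-annotated for the example sentence, not the output of an actual SRL tool, and the `normalize` helper is a made-up simplification:

```python
# Hand-annotated SRL frames for the active and passive variant of the
# example sentence; in practice, these labels come from an SRL tool.
active = {
    "V": "broke",
    "A0": "John",               # agent
    "A1": "the window",         # patient
    "AM-MNR": "with a rocket",  # manner adjunct
}
passive = {
    "V": "broken",
    "A0": "by John",
    "A1": "The window",
    "AM-MNR": "with a rocket",
}

def normalize(arg):
    # Strip letter case and surface marking ("by", article) so that the
    # underlying role fillers become comparable. Requires Python 3.9+.
    return arg.lower().removeprefix("by ").removeprefix("the ").strip()

# The numbered arguments denote the same role fillers in both variants.
for role in ("A0", "A1", "AM-MNR"):
    assert normalize(active[role]) == normalize(passive[role])
print("numbered arguments are stable across syntactic variation")
```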
Within these competitions, the PropBank corpus was annotated with semantic roles as training data. While the top submission of the first shared task achieved a precision of 72.43% and a recall of 66.77%⁵, the top submission in 2005 already achieved a precision of 81.18% and a recall of 74.92% on an enlarged corpus⁶. Although the competitions were conceived to run only for several months and further submissions are no longer possible, new systems are still evolving and are compared under those conditions [GCW+16, FTGD15, PP14].

We use SENNA⁷ as a state-of-the-art implementation for SRL tagging. It uses a unified neural network architecture to perform SRL and achieved an F1 score of 75.49% for CoNLL in 2005.

3.3 Building the graph

As the last step of the pipeline, an exporter builds a graph from the annotated natural language. The exporter searches the text for SRL verbs and their arguments. If at least a first and a second argument exist, it creates an RDF triple with the first SRL argument as subject, the second one as object, and the verb as predicate. All other arguments are stored as metadata of the triple inside the relation. If there is no second argument, a reflexive relation is built and exported.

SRL annotates the tokens in a sentence as they are (like "the window" or "broke"). However, this is not useful when building a graph with nodes and edges identified by unique labels, as it would lead to duplicate nodes and edges if the same phrase occurs multiple times in different grammatical variations. To circumvent this issue, the exporter also normalizes the words by using their lemmas and removing articles on the basis of their part-of-speech tags.

4 Early Validation

We applied the pipeline from Section 3 to two requirements documents. Both have an automotive background and contain real industrial data, but they differ in size and do not relate to each other. For this reason, we applied the pipeline to each of them separately.
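Before turning to the results, the exporter logic from Section 3.3 can be summarized in a simplified sketch. The lemma table is hard-coded and the input frames are invented, whereas the real pipeline obtains both from CoreNLP and SENNA:

```python
# Simplified sketch of the graph exporter: one SRL frame per verb, with
# numbered arguments A0/A1 and any further arguments kept as metadata.
LEMMAS = {"broke": "break", "windows": "window", "runs": "run"}  # stand-in for CoreNLP
ARTICLES = {"the", "a", "an"}

def trim(phrase):
    """Normalize a phrase: drop articles, replace words by their lemma."""
    words = [LEMMAS.get(w, w) for w in phrase.lower().split() if w not in ARTICLES]
    return " ".join(words)

def export_triples(frames):
    triples = []
    for frame in frames:
        subject = trim(frame["A0"])
        predicate = trim(frame["V"])
        # Without a second argument, a reflexive relation is exported.
        obj = trim(frame["A1"]) if "A1" in frame else subject
        meta = {k: v for k, v in frame.items() if k not in ("A0", "A1", "V")}
        triples.append((subject, predicate, obj, meta))
    return triples

frames = [
    {"V": "broke", "A0": "John", "A1": "the windows", "AM-MNR": "with a rocket"},
    {"V": "runs", "A0": "The engine"},  # no A1 → reflexive relation
]
print(export_triples(frames))
# → [('john', 'break', 'window', {'AM-MNR': 'with a rocket'}),
#    ('engine', 'run', 'engine', {})]
```

The normalization in `trim` is what prevents "the windows" and "a window" from becoming two distinct nodes in the graph.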
The first one is the system requirements specification Automotive System Cluster (ASC)⁸, which describes two subsystems, ELC (exterior light system) and ACC (adaptive cruise control). The document is a public case study, which is derived from real requirements but largely reduced and modified by the original authors in size and detail. The other document is provided by our industry partner and describes a single subsystem called "Charging of electric vehicles".

Table 1 contains the central measures of the two requirements documents and the knowledge representation graphs that we extracted for both documents. The graphs for both requirements documents are shown in Figure 2. The graph in Figure 2a has different colors representing from which of the two contained systems the nodes and edges originate: nodes and edges extracted from ELC are colored grey, ACC nodes and edges are light grey. If they are contained in both systems, they are colored black.

⁴ https://verbs.colorado.edu/~mpalmer/projects/ace.html
⁵ https://www.cs.upc.edu/~srlconll/st04/st04.html
⁶ https://www.lsi.upc.edu/~srlconll/st05/st05.html
⁷ https://ronan.collobert.com/senna/
⁸ https://www.aset.tu-berlin.de/fileadmin/fg331/Docs/ASC-EN.pdf

Table 1: Study objects

                            ASC         Charging System
requirements                153         7,238
authors                     4           >66
sentences                   153         9,213
tokens (words and marks)    4,140       186,478
nodes [main graph]          345 [149]   13,676 [6,702]
edges [main graph]          440 [228]   19,769 [12,715]

(a) Automotive System Cluster (ASC)   (b) Charging System
Figure 2: Knowledge representation graphs extracted from two requirements documents

Both graphs in Figure 2 show some similarities: both contain a "main graph" in the upper left with a lot of nodes and edges linked to each other, several "sub graphs" around the "main graph" with some nodes and edges, and a lot of "single graphs" with just two or three edges in the lower part of the figures.
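The distinction between "main graph", "sub graphs", and "single graphs" corresponds to the sizes of the connected components of the knowledge graph, which can be computed as in the following sketch (toy edge list, graph treated as undirected):

```python
from collections import defaultdict

def component_sizes(edges):
    """Sizes of connected components, treating edges as undirected."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen, sizes = set(), []
    for start in adj:
        if start in seen:
            continue
        stack, size = [start], 0   # depth-first traversal of one component
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            size += 1
            stack.extend(adj[node] - seen)
        sizes.append(size)
    return sorted(sizes, reverse=True)

# Toy graph with invented labels: one "main graph", one "sub graph",
# and one "single graph".
edges = [("vehicle", "driver"), ("vehicle", "headlight"),
         ("driver", "pedal"), ("pedal", "speed"),
         ("battery", "charger"), ("charger", "plug"),
         ("ASIL-B", "function")]
print(component_sizes(edges))   # → [5, 3, 2]
```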
The number of nodes and edges for each graph is stated in Table 1, as well as the sum of the "main graph" and the "sub graphs", denoted as [main graph]. While most nodes and edges represent concepts and relations that are only contained in one system, there are some black nodes. Two of them in the middle of the main graph are "vehicle" and "driver". They are central because they are typically stated in almost every requirements document of automotive software. The other black nodes represent "ASIL-B", which stands for an automotive safety integrity level, and "function" respectively "user function", which exist in both systems as common phrases.

These results cannot yet be interpreted in detail due to a lack of corresponding quality criteria and currently non-existent use cases. Of course, the pipeline is just a prototype, which is neither able to extract every piece of information correctly nor to build a valid ontology or RDF graph. Nevertheless, it is observable from Figure 2a that the two systems have no explicitly described common interfaces. This observation can also be made in the original text document. The sub graphs and single graphs contain information without any connection to the main graph, which indicates that this connection is not explicitly stated in the requirements document. One reason might be that this is knowledge the author considers to be common and therefore does not document.

5 Open Challenges

As stated in Section 4, our approach is not able to capture every bit of information contained in the original requirements documents. The major reason for this is the loss that has to be accepted whenever NLP techniques are applied to real data. Each step in the pipeline is prone to some errors (missed or incorrectly identified information). Each requirements document has its own structure and is often extended with additional metadata.
The pipeline does not consider these characteristics of a document, as it interprets each sentence independently, step by step. One of the challenges that we want to address in the future is to integrate structural properties of the document into the knowledge graph (e.g., sentences within one requirement, requirements arranged under one headline).

Causal Relations: Requirements, especially functional ones, often describe causal relationships like "If [A] then [B]". If [A] contains a subject, predicate, and object, SRL annotates those accordingly and annotates the words in [B] as a single adjunct argument like general-purpose or cause. Currently, we do not consider such causal relations, as the graph is built only from the first two arguments of the SRL. Integrating such causal relations into the current graph model is not possible, since [A] is already a graph which cannot be related (connected via an edge) to [B], which is a graph itself. Causal relations that we extract from the requirements may be represented as inference rules in an RDF schema.

Semantic Similarity: Another problem we observed in the data is that some nodes and edges seem redundant because different words or phrases are used with the same meaning. Some of them are just wording issues like "speed set point of the cruise control" and "cruise control speed set point" in ASC, but there are also other difficulties like "vehicle" and "car". A major challenge for our approach is to identify such semantic similarities to keep the knowledge graph free of redundancies. The research field dealing with this kind of challenge is called semantic similarity. Determining the semantic similarity between words or phrases would enable us to identify edges that are identical in the meaning of their concepts. These edges could be connected to each other or merged into a single one, resulting in a more connected "main graph" and fewer "sub graphs" or "single graphs".
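A common way to quantify such similarity, used in particular by the word-embedding approaches discussed below, is the cosine similarity of vector representations. The sketch uses tiny made-up vectors; real embeddings have hundreds of dimensions:

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Made-up 3-dimensional embeddings, chosen for illustration only:
# "vehicle" and "car" point in a similar direction, "window" does not.
vectors = {
    "vehicle": [0.9, 0.8, 0.1],
    "car":     [0.8, 0.9, 0.2],
    "window":  [0.1, 0.2, 0.9],
}

print(cosine_similarity(vectors["vehicle"], vectors["car"]))     # close to 1
print(cosine_similarity(vectors["vehicle"], vectors["window"]))  # much lower
```

Given such a score, nodes whose labels exceed a similarity threshold could be candidates for merging, although choosing that threshold for domain vocabulary is itself an open question.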
There are existing approaches to identify semantic similarity. A very promising solution is "Align, Disambiguate and Walk: A Unified Approach for Measuring Semantic Similarity" [PJN13]. It is applicable at multiple lexical levels, from single words up to complete texts, comparing each word within its semantic signature. A disadvantage for our use case is the dependency on WordNet, as we want to analyze requirements specifications from the automotive domain, which WordNet does not cover in detail.

Another option are word embeddings like word2vec [MCCD13] or GloVe [PSM14]. They calculate dense vectors to represent words, which enables comparing words by the similarity of their vectors. One downside is that the calculation requires a lot of data to build appropriate vectors. Also, these word embeddings rely on the distributional hypothesis, the assumption that semantically similar words appear in similar contexts, which is not always the case [MST+16].

No Ontology: As stated before, the graph neither meets the RDF specification nor is it a valid ontology. The nodes and edges are not classified or categorized, e.g., into literal or property as in an RDF graph, nor does the current pipeline examine whether a node or edge is part of the schema or must be part of the instance level. To leverage the full potential of knowledge reasoning, we need to enhance the data with such additional metadata. Depending on the application of this pipeline, this could be done with a predefined schema into which the data should fit, or this information could be extracted from the graph itself.

6 Potential Applications

Once we have a good representation of the knowledge contained in a number of requirements documents, we see a number of potential applications.

Knowledge querying: In semantic networks, a requirements engineer can efficiently search for specific information that is contained in any requirements document.
In contrast to simple full-text search, a user can express more complex and semantically enriched queries such as "Return all subsystems that interact with my subsystem". Besides querying the semantic net for information that is already encoded in the network, there are also algorithms that can be used to query for associated knowledge (e.g., spreading activation [Cre97]). This allows the user to state queries such as "Return a list of the most relevant stakeholders given a certain feature". If these advanced querying features are integrated into an interactive tool, requirements reviewers may benefit greatly from them.

Trace link generation: There are already a number of papers that propose using information retrieval techniques for trace link recovery [HDO03]. In such approaches, single requirements are formulated as queries to find related requirements in a large set of requirements documents. The resulting requirements are considered trace link candidates. Our approach may improve these techniques by leveraging the structure of the knowledge graph.

Integration of other NL sources: Requirements documents are not the only natural language documents that contain knowledge about the system to be developed. We plan to integrate knowledge from other natural language artifacts, such as system test cases or safety analysis documents, into the knowledge representation graph as well.

Detection of implicit knowledge: Implicit knowledge is a potential source of misunderstandings and errors. When requirements authors assume that certain knowledge is obvious or clear to the reader, they do not document this knowledge explicitly. However, sometimes this assumption is not true. Knowledge representation graphs can indicate implicit knowledge by small subgraphs with concepts that are not related to other concepts. This indicates that the authors presume that additional relations of these concepts are clear to the reader.
Another possibility to detect implicit knowledge is to compare the total number of relations that a concept has over all documents with the number of relations that this concept has in one document. Concepts with many relations are "rich" concepts in the sense that authors provide much information about these concepts in their documents. If there is only little information about such "rich" concepts in one specific document (i.e., only a small number of relations originate from that document), this may indicate that the author assumed much implicit knowledge.

7 Conclusions

In this paper, we have proposed an NLP pipeline for extracting knowledge from requirements documents and expressing this knowledge as a graph. We use Semantic Role Labeling within the pipeline to extract RDF triples from requirements sentences. As a first validation, we have applied the NLP pipeline to two requirements documents, one academic example with 150 requirements and one industrial document with over 7,000 requirements. The validation shows that both resulting knowledge graphs consist of one main subgraph, which contains and connects between 40% and 50% of all nodes, and several smaller subgraphs with less than 10 connected nodes. In the future, we plan to enhance the approach by identifying semantic similarities between concepts and by integrating causal relations expressed in requirements to enable reasoning about the knowledge. We have identified a number of potential applications where such a knowledge representation may support RE tasks.

References

[BGM85] Alexander Borgida, Sol Greenspan, and John Mylopoulos. Knowledge representation as the basis for requirements specifications. Computer, 18(4), 1985.

[BM15] Abel Browarnik and Oded Maimon. Ontology learning from text: Departing the ontology layer cake. International Journal of Signs and Semiotic Systems, 4(2), 2015.

[CM04] Xavier Carreras and Lluís Màrquez. Introduction to the CoNLL-2004 shared task: Semantic role labeling.
In Computational Natural Language Learning (CoNLL), 2004.

[CM05] Xavier Carreras and Lluís Màrquez. Introduction to the CoNLL-2005 shared task: Semantic role labeling. In Computational Natural Language Learning (CoNLL), 2005.

[Cre97] Fabio Crestani. Application of spreading activation techniques in information retrieval. Artificial Intelligence Review, 11(6), 1997.

[CWB+11] Ronan Collobert, Jason Weston, Léon Bottou, Michael Karlen, Koray Kavukcuoglu, and Pavel Kuksa. Natural language processing (almost) from scratch. Computing Research Repository (CoRR), abs/1103.0398, 2011.

[DVB+16] Diego Dermeval, Jéssyka Vilela, Ig Ibert Bittencourt, Jaelson Castro, Seiji Isotani, Patrick Brito, and Alan Silva. Applications of ontologies in requirements engineering: A systematic review of the literature. Requirements Engineering, 21(4), 2016.

[FTGD15] Nicholas FitzGerald, Oscar Täckström, Kuzman Ganchev, and Dipanjan Das. Semantic role labeling with neural network factors. In Conference on Empirical Methods in Natural Language Processing (EMNLP), 2015.

[GCW+16] Jiang Guo, Wanxiang Che, Haifeng Wang, Ting Liu, and Jun Xu. A unified architecture for semantic role labeling and relation classification. In International Conference on Computational Linguistics (COLING), 2016.

[HDO03] Jane Huffman Hayes, Alex Dekhtyar, and Jeffrey Osborne. Improving requirements tracing via information retrieval. In 11th IEEE International Requirements Engineering Conference (RE), 2003.

[HK16] Kerstin Hartig and Thomas Karbe. Recommendation-based decision support for hazard analysis and risk assessment. In Eighth International Conference on Information, Process, and Knowledge Management (eKNOW), 2016.

[MCCD13] Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. Efficient estimation of word representations in vector space. Computing Research Repository (CoRR), abs/1301.3781, 2013.

[MSB+14] Christopher D. Manning, Mihai Surdeanu, John Bauer, Jenny Finkel, Steven J.
Bethard, and David McClosky. The Stanford CoreNLP natural language processing toolkit. In Association for Computational Linguistics (ACL) System Demonstrations, 2014.

[MST+16] Nikola Mrksic, Diarmuid Ó Séaghdha, Blaise Thomson, Milica Gasic, Lina Maria Rojas-Barahona, Pei-Hao Su, David Vandyke, Tsung-Hsien Wen, and Steve J. Young. Counter-fitting word vectors to linguistic constraints. Computing Research Repository (CoRR), abs/1603.00892, 2016.

[PJN13] Mohammad Taher Pilehvar, David Jurgens, and Roberto Navigli. Align, disambiguate and walk: A unified approach for measuring semantic similarity. In 51st Annual Meeting of the Association for Computational Linguistics (ACL), 2013.

[PP14] Gerhard Paass and Bhanu Pratap. Semantic role labeling using deep neural networks. In International Workshop on Representation Learning (RL), 2014.

[PSM14] Jeffrey Pennington, Richard Socher, and Christopher D. Manning. GloVe: Global vectors for word representation. In Conference on Empirical Methods in Natural Language Processing (EMNLP), 2014.

[VF13] Andreas Vogelsang and Steffen Fuhrmann. Why feature dependencies challenge the requirements engineering of automotive systems: An empirical study. In 21st IEEE International Requirements Engineering Conference (RE), 2013.

[VFN13] Paola Velardi, Stefano Faralli, and Roberto Navigli. OntoLearn reloaded: A graph-based algorithm for taxonomy induction. Computational Linguistics, 39(3), 2013.