R2BC: Tool-Based Requirements Preparation for Delta Analyses by Conversion into Boilerplates

Konstantin Zichler
Advanced Engineering Projects
HELLA GmbH & Co. KGaA
Lippstadt, Germany
konstantin.zichler@hella.com

Steffen Helke
Safe and Secure Software Systems
Brandenburg University of Technology
Cottbus-Senftenberg, Germany
steffen.helke@b-tu.de

ASE 2019: 16th Workshop on Automotive Software Engineering @ SE19, Stuttgart, Germany

Abstract—Automotive OEMs and suppliers negotiate different documents before they sign contracts for a product development. These include the Component Requirements Specification (CRS), which is submitted by the OEM. The CRS describes the characteristics of the product to be developed in detail and is therefore the basis for the supplier's development effort estimation. If the specified component is a successor of an already available product, the requirements specifications of both the successor and the predecessor products can be compared to estimate the development effort for the new component. This activity is called delta analysis. Due to a lack of sufficient tool support, the delta analysis is still a predominantly manual task. The main reason for this is that the documents to be compared are structurally too different. In this work, we introduce a new method for an automated conversion of an OEM's unstructured or otherwise structured CRS into a structured language used by the supplier. The process uses established NLP tools to analyze the CRS and then translates the OEM's requirements into supplier-specific boilerplates using a newly developed technique. The concept is implemented with the R2BC prototype, which demonstrates the feasibility of the approach and enables the processing of first real CRS.

Index Terms—Requirements engineering, boilerplates, natural language processing, delta analysis

I. INTRODUCTION

During the early phase of sourcing, OEMs submit Requests for Quotation (RFQ) to automotive suppliers. Among other documents, this request includes the Component Requirements Specification (CRS). The CRS describes the properties of the component that shall be developed. The RFQ prompts the supplier to offer the specified component at a certain price within a limited amount of time. In order to provide the OEM with an offer, the supplier first has to estimate the effort necessary to develop the requested component. If the supplier has already developed similar parts in the past, the former specification documents can be used to estimate the development effort. In this case, requirements engineers perform a delta analysis. The delta analysis refers to the activity of comparing two requirements specifications to determine the differences, namely the deltas, between the listed requirements. It is a common procedure when a successor of an already available product is to be developed and the requirements specifications of both the successor and the predecessor products are available. The results of the comparison between both requirements specifications are used to estimate the effort necessary to realize the successor product. The advantage of this approach is that the effort for the realization of the predecessor product is already known, and hence the effort for the adaptations that are necessary to realize the successor product can be estimated. The following example illustrates a deviation between two similar requirements:

1) If the _combustion engine is running_, the function ECU self-diagnosis shall be active.
2) If the _vehicle battery is charging_, the function ECU self-diagnosis shall be active.

The underlined parts of the requirements sentences highlight the delta. In this example, the additional working time of the ECU self-diagnosis function is to be considered. This can lead to additional development effort and may even require new components within the vehicle. These add-ons are subsequently subject to the effort estimation activities of the supplier and hence the basis for the price indication for the RFQ.

Due to a lack of sufficient tool support, the delta analysis is still a predominantly manual activity. Requirements engineers experience the comparison of two documents with roughly 100 to 300 or more pages each as tedious and time-consuming. In our opinion, this time should rather be invested in creative work, which produces higher added value for the company. For this reason, our development activities are focused on a novel approach for an automated delta analysis. From the experience we gained during our previous work [13], we know that an automated delta analysis would attain a higher accuracy if the sentences that are compared with each other have the same syntax.

A consistently equal syntax can be reached during the documentation of requirements by using boilerplates. A boilerplate is a blueprint that determines the syntactical structure of a single requirement [11]. Boilerplate-compliant requirements have the same structure if the same type of boilerplate is used. That means that certain elements of a sentence appear in the same order within all requirements of the same type. This fact enables a machine to compare certain parts of requirements and to determine the deviation, or in other words, the delta. In practice, companies use different boilerplates. Before requirements engineers of the supplier can benefit from an automated delta analysis, the submitted OEM requirements have to be converted into boilerplate-compliant requirements first. This is especially the case when requirements engineers at the supplier side use company-specific boilerplates. It is important to mention that the time schedule for responding to an RFQ is very tight. This fact makes a manual conversion of requirements into boilerplates unfeasible.

A convenient solution for this problem can only be achieved by suitable tool support. This tool support should be economically reasonable by requiring the least amount of time and personnel deployment for the conversion to boilerplates. Besides these requirements, further challenges for the tool support arise from the quality of the submitted requirements. In the literature, certain quality criteria are known: atomicity, correctness, completeness, unambiguousness etc. [8]. Requirements boilerplates are designed to support requirements quality by their mere structure [11]. Many random natural language requirements do not meet these quality criteria and hence do not fit accurately into the predefined boilerplates. Natural language requirements therefore have to be reshaped by the tool support before they can be converted into boilerplates. Table I gives an overview of the tasks that should be accomplished during the conversion of random natural language requirements into boilerplates. In the following, we elaborate on selected examples:

1) Restructuring of a sentence: Example 1 (Table I) shows an input requirement that is written in the passive form. The conversion of this requirement into an active form requires a rearrangement of the sentence parts. The output requirement after the conversion reads as follows: "The ECU shall monitor the liquid temperature."
2) Adaptation of words: Once the sentence parts of a requirement have been rearranged, some words of this requirement need adaptation (see Example 2 in Table I). The phrase "The temperature of the liquid" previously started with a capital letter. Now this phrase is located in the middle of the requirements sentence. This requires an adaptation of the word "The", which is now written with a lower-case letter: "the".
3) Atomization of requirements: Example 3 (Table I) contains two requirements. This is shown by the two process words, which describe two different functions of the ECU. A proper conversion into boilerplates requires a split of the requirement into the following two requirements: (1) "The ECU shall monitor the liquid temperature." and (2) "The ECU shall store the liquid temperature data."
4) Resolving co-reference: The second sentence in Example 5 (Table I), "It shall store the temperature data.", addresses the ECU and thereby constitutes a second requirement. The conversion of this second sentence into a requirement requires a specification of the corresponding actor. The accurate conversion of these two sentences leads to the following two requirements: (1) "The ECU shall monitor the liquid temperature." and (2) "The ECU shall store the liquid temperature data."

TABLE I. EXAMPLES FOR CONVERSION TASKS

| No. | Task | Input | Output |
| 1 | Restructuring of a sentence (e.g. from passive to active form) | The _liquid temperature_ shall be monitored by the ECU. | The ECU shall monitor the _liquid temperature_. |
| 2 | Adaptation of words | The temperature of the liquid shall be _sent_ by the temperature sensor to the monitoring system. | The temperature sensor shall _send_ the temperature of the liquid to the monitoring system. |
| 3 | Atomization of requirements | The ECU shall monitor and store the liquid temperature. | The ECU shall monitor the liquid temperature. The ECU shall store the liquid temperature data. |
| 4 | Identification of an actor in indefinite wording | It shall be ensured that the liquid temperature does not exceed the threshold of 120 °C. | The ECU shall ensure that the liquid temperature does not exceed the threshold of 120 °C. |
| 5 | Resolving co-reference | The ECU shall monitor the liquid temperature. It shall store the temperature data. | The ECU shall monitor the liquid temperature. The ECU shall store the liquid temperature data. |

Among others, [6] and [7] present methods and tool support for the documentation of requirements with boilerplates right from the beginning. The majority of CRS in industry, however, are based on different styles of boilerplates or do not comply with boilerplates at all. This means that automotive suppliers receive already documented requirements with these characteristics. Therefore, available approaches that require the use of boilerplates during the documentation of requirements cannot help to overcome issues that arise while handling finalized specifications.

For these reasons, we suggest a semi-automated conversion of random natural language requirements to predefined boilerplates. Our tool, the Requirements to Boilerplates Converter (R2BC), converts randomly formulated natural language requirements, which are provided in various document formats, into predefined boilerplates. Alongside the conversion, the R2BC concept aims at the rectification of requirements flaws to increase requirements quality. Our solution involves natural language processing (NLP) techniques and a prototype of the R2BC developed in-house. Moreover, in this work we provide future users of the R2BC with the corresponding methodology for the realization of a semi-automated conversion of requirements into boilerplates. The basic aim of the technology presented in this work is to make requirements machine-readable. Our tool, the R2BC, is a prerequisite for the subsequent automated delta analysis. The automated delta analysis is part of our ongoing research activities. Within this work, we focus on the R2BC and its integration into the broader methodology of an automated delta analysis.

The remainder of this work is structured as follows. Chapter II provides the reader with the fundamentals on boilerplates and NLP. We introduce the reader to the R2BC methodology in Chapter III. Results of preliminary experiments are presented and discussed in Chapter IV. In Chapter V, we summarize major publications on related work. Chapter VI summarizes our findings and gives an outlook on further research activities.

II. FUNDAMENTALS

A. Requirements Boilerplates

Boilerplates are used to improve requirements quality and to increase the degree of formalization of requirements. A boilerplate is a blueprint that determines the syntactical structure of a single requirement. This predefined sentence structure helps to prevent phrasing errors, like the passive form, while documenting requirements. Boilerplates can be handled easily by inexperienced authors to write accurate requirements [11]. The following example shows a company-specific boilerplate:

    The complete system "<system name>" shall <description>.

This boilerplate can be used to document functional requirements. It consists of editable and non-editable parts. To document a functional requirement, the author replaces the "<system name>" part with the system name and the <description> part with the function description of the system under consideration. The phrase "The complete system" and the modal "shall" constitute the non-editable parts of the boilerplate. Company-specific boilerplates reflect specific needs coming from the requirements engineering processes applied by a certain company. For instance, the semiformal structure of boilerplate-compliant requirements allows requirements models to be derived automatically and these models to be processed further for requirements verification. Alongside company-specific boilerplates, the well-known boilerplates introduced by Chris Rupp [11] and the EARS boilerplates [9] are used in the industry.

B. Natural Language Processing

In this work, we use NLP to analyze requirements text and to identify specific elements of the requirements sentence. NLP is a means for the computerized understanding, analysis, manipulation, and generation of natural language [9]. We use the General Architecture for Text Engineering (GATE) [3] to run natural language pre-processing on the requirements documents. GATE is open-source software that is mainly used to annotate text, either manually or automatically. A wide range of applications, which are called processing resources, are available under GATE. We explain the main processing resources for our application using the following original requirement:

    The ECU shall monitor the liquid temperature.

The tokenizer splits the text into tokens, such as numbers, punctuation marks and words. "The", "ECU" and "." are tokens within the example sentence. In the following steps, these tokens can be used to analyze the text in more depth, e.g. with gazetteers. Gazetteers are used to recognize named entities in text. They consist of lists with numbers or names of entities, like cities, organizations or first names. Once a string in the text equals a string in a gazetteer, the named entity can be assigned. For this purpose, the string in the text receives an annotation called "Lookup". Another important application is the sentence splitter, which splits the text into sentences. To this end, a gazetteer list with abbreviations is used to distinguish these from punctuation marks that mark the end of a sentence. Part-of-speech taggers determine the part of speech of a token and annotate it accordingly. Once annotations for all of this information are available, GATE can invoke JAPE (Java Annotation Patterns Engine) transducers. This tool searches for a predefined pattern in the text and then annotates this part of the text according to a predefined rule. For this search, the JAPE transducer uses the information regarding tokens, sentence splits etc. [4].

In the given example, a pattern described by a JAPE rule could search for the sentence part between "shall" and the punctuation mark. The action part of the rule would cause this part to be annotated, for example, with "description". A second JAPE rule can be used subsequently to search for the "description" annotation. As a consequence, the underlying string "monitor the liquid temperature" of the "description" annotation can be transferred into the editable part of the boilerplate presented in Chapter II-A. Together with the complete system name "ECU", which would be recognized accordingly, the conversion would lead to the following result:

    The complete system "ECU" shall monitor the liquid temperature.

During our previous work [13], we observed that JAPE rules work at a higher accuracy when applied to sentences that have an equal syntax. Experience from practical work nevertheless shows that requirements syntax varies a lot. NLP provides means to cope with these variances. The pre-processing of requirements in several steps makes it possible to recognize relevant parts of requirements, although presented in different syntaxes, and to transfer these parts into boilerplates. Once all requirements are available in boilerplates, further NLP can be performed with high accuracy, e.g. for an automated delta analysis.

III. REQUIREMENTS TO BOILERPLATES CONVERTER

In this chapter we present our concept for the Requirements to Boilerplates Converter (R2BC). The R2BC is a prerequisite for an automated delta analysis. We therefore first describe how the R2BC is integrated into the broader methodology of the automated delta analysis. The second part of this chapter describes the architecture of the R2BC. In the third part, we explain the natural language pre-processing component.
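
As a rough preview of what the conversion does with a single sentence, the two-rule extraction from the example in Chapter II-B can be mimicked in a few lines of plain Python. This is only a sketch under stated assumptions and not GATE or JAPE code: the regular expression stands in for the JAPE rule that annotates the text between "shall" and the final punctuation mark, the list `SYSTEM_NAMES` is a hypothetical gazetteer, and the names `lookup_system`, `to_boilerplate_a` and the template placeholders are made up for illustration.

```python
import re

# Hypothetical gazetteer of system names (illustrative entries only).
SYSTEM_NAMES = ["ECU", "temperature sensor"]

# Boilerplate for functional requirements; the placeholder names are assumptions.
BOILERPLATE_A = 'The complete system "{system}" shall {description}.'

# Stand-in for the first JAPE rule: capture the text between "shall"
# and the punctuation mark that ends the sentence.
DESCRIPTION_RULE = re.compile(r"\bshall\s+(.+?)\.\s*$")


def lookup_system(sentence, names=SYSTEM_NAMES):
    """Return the longest gazetteer entry found before 'shall', or None."""
    subject = sentence.split(" shall ", 1)[0].lower()
    # Longest entries first, so that a short name cannot shadow a longer one.
    for name in sorted(names, key=len, reverse=True):
        if re.search(r"\b" + re.escape(name.lower()) + r"\b", subject):
            return name
    return None


def to_boilerplate_a(sentence):
    """Fill the boilerplate from an active-voice sentence, or return None."""
    match = DESCRIPTION_RULE.search(sentence)
    system = lookup_system(sentence)
    if match is None or system is None:
        return None  # the sentence does not fit; leave it unconverted
    return BOILERPLATE_A.format(system=system, description=match.group(1))


print(to_boilerplate_a("The ECU shall monitor the liquid temperature."))
```

The sketch deliberately handles only active-voice sentences whose subject contains a known system name; a passive sentence such as "The liquid temperature shall be monitored by the ECU." is returned as `None`, which corresponds to a requirement that would need restructuring first.
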
The concept for the R2BC is presented in the fourth part of this chapter. We complete this chapter by describing the working methodology for the usage of the R2BC.

A. R2BC as Part of a Methodology for an Automated Delta Analysis

We suggest a novel approach for an automated delta analysis as depicted in Fig. 1. The process is triggered once an OEM submits a CRS to a supplier. During Step 1 of the process, the R2BC is used to convert the OEM natural language requirements to the boilerplates used by the supplier. Once the requirements of the OEM and the supplier have the same syntax, our tool called Delta Analyzer (DA) performs the automated delta analysis (Step 2). As a result, the DA provides a report, which can be used to estimate the effort necessary for the realization of the successor product.

FIGURE 1. Methodology for an automated delta analysis

B. R2BC Architecture

The architecture of the R2BC comprises three main components: a natural language pre-processing component, a converter and a GUI (see Fig. 2). The advantage of the R2BC is its flexibility. OEM CRS submitted to the supplier are diverse in wording, caused by different authors, or they differ in their document formats.

To cope with this fact, we implemented a natural language pre-processing component (NL pre-processing) in our tool. This component enables a flexible recognition of certain pieces of text and provides the input for the actual conversion into boilerplates. The centerpiece of the R2BC is called "Converter", our proprietary development. This component loads the input CRS from the file system of the computer and exports the CRS with the converted boilerplates to the file system. To this end, the Converter invokes the functionality of the NL pre-processing to recognize the necessary parts of the requirements text and then converts them into boilerplates. The Converter itself is controlled by the GUI, which the user operates to perform the required operations. All mentioned components of the R2BC are described in the following sections, starting with NL pre-processing.

FIGURE 2. R2BC Implementation

C. Natural Language Pre-processing

For the NLP we compiled a processing pipeline in GATE, which consists mainly of ANNIE [2] resources and several JAPE transducers. Fig. 3 gives an overview of the applied resources and the process.

FIGURE 3. NL Pre-processing

Once the Converter invokes the NL pre-processing component, the following steps are performed:

1) The Converter loads the CRS of the OEM into GATE. This document is converted into a corpus, which is the basis for the NLP.
2) The corpus is analyzed by the components Document Reset PR, English Tokeniser, Gazetteer, Sentence Splitter, POS Tagger, NE Transducer and OrthoMatcher of the application ANNIE. We customized the mentioned components to our specific needs. ANNIE annotates the text and provides an annotated corpus as a result. These annotations provide mainly language-specific information.
3) During the third step, several JAPE transducers use JAPE rules to search for annotations in the corpus. These JAPE rules are defined in advance and are intended to search for specific parts of the requirements sentences, which are transferred into the editable parts of boilerplates. First, JAPE rules determine which requirement fits which boilerplate. Then, certain pieces of the annotated requirements are annotated according to the editable parts of boilerplates. All requirements and other statements in the CRS that do not fit into boilerplates receive a corresponding annotation. These sentences will be transferred into the export document without conversion into boilerplates.

The result of the NL pre-processing component is an annotated corpus, which is used by the Converter to convert requirements into boilerplates.

D. Converter

The Converter is the heart of the requirements to boilerplates conversion. This component loads documents into the NL pre-processing component and invokes the natural language analysis of the text. Once the NL pre-processing is finished, the Converter gathers the annotated text and searches among the contained annotations for text pieces that are to be converted to boilerplates. During the next step, all requirements that fit the applied boilerplates are converted. The following two requirements give a simplified example of a conversion:

1) The _oil temperature_ shall be _monitored_ by the ECU
2) The "ECU" shall _monitor the oil temperature_

Requirement (1) is the original requirement. In the course of the conversion, components of this requirement are rearranged and certain words are adapted automatically. At this stage, the user can also make adaptations to the conversion results. Because some requirements can be converted to several boilerplates, several conversion results are available for these requirements. For this reason, all conversion results are stored by the Converter for the moment. It is the user who ultimately decides which conversion alternative is correct. In our working methodology for the R2BC, this step is called "Approve results". All steps of the working methodology are described in Section III-E. After the user has approved all results of the conversion, the results can be exported. The Converter exports the boilerplate-compliant requirements to a format of choice. Among others, we consider the formats Word, PDF and the Requirements Interchange Format (ReqIF) [5] to enhance the work with requirements management tools.

E. R2BC Methodology

The R2BC allows a semi-automated conversion of natural language requirements into predefined boilerplates. The user interaction with the R2BC follows a four-step methodology: convert requirements, review results, edit results and approve results. We describe this methodology by means of the R2BC GUI depicted in Fig. 4.

FIGURE 4. R2BC GUI

Convert requirements: The user starts the conversion process for a CRS that is already loaded into the application by pushing the "Start conversion" button (Fig. 4, Step 1). This triggers an algorithm that invokes NLP to annotate the input CRS. The R2BC browses the created annotations and selects certain parts of the requirements text in the input CRS. As a consequence, all relevant requirements are converted automatically into boilerplates by the R2BC.

Review results: During the next step (Fig. 4, Step 2), the user reviews all converted requirements. For each converted requirement, the R2BC provides the user with a view of the original requirement and a view of alternatives for the conversion, called conversion results. In the original requirement view, the converted requirement is displayed in its original shape. In some cases, it is possible that one requirement fits several boilerplates. For this purpose, the R2BC shows all possible conversion alternatives, so that the user may select the most convenient one. Conversion results that are incorrect can be skipped. The R2BC facilitates the review process by highlighting the parts of the requirements that were changed by the conversion. Changes involve the syntactical structure of the requirements sentence as well as upper and lower case and word endings. By displaying the adjacent context of the original requirement in the original requirement view, the R2BC helps the user to evaluate the accuracy of the conversion result and to select the most convenient alternative.

Edit results: Within the edit results step (Fig. 4, Step 3), the user can make adaptations to the conversion results. This may be necessary if a conversion result has a defect. For this purpose, the user presses the edit button next to the corresponding conversion alternative. This activates the editing function within the conversion results view.

Approve results: Finally, the user approves all valid alternatives (Fig. 4, Step 4), whereby only one alternative per input requirement can be approved.

IV. EXPERIMENTS AND DISCUSSION

This chapter presents the results of preliminary tests and feedback from requirements experts. It is based on the R2BC prototype, which was implemented and tested in an ongoing research project by Ritter und Schul [10].

A. Preliminary Experiments

The aim of the preliminary experiments was to evaluate the effectiveness of the R2BC prototype. The fully automated conversion is considered effective when relevant requirements in a CRS are identified and converted into boilerplates accurately. A requirement is relevant for the conversion if a corresponding boilerplate for this type of requirement is applied, e.g. a boilerplate for conditional requirements. We assessed the effectiveness of our prototype by calculating the precision of the identification of relevant requirements in a given CRS. For all identified requirements, we also assessed the conversion accuracy. For this purpose, we calculated the percentage of accurately converted requirements among all converted requirements.

Since different styles of CRS are submitted by OEMs in practice, we tested our prototype on three CRS from different requirements authors to increase the significance of the experimental findings. CRS 1 comprises 88 pages. CRS 2 is an extensive document with 567 pages, and CRS 3 comprises 60 pages. All three documents consist mainly of requirements documented as complete and fragmented sentences.

As already mentioned, many companies define their own specific boilerplates. For our experiments, we have chosen the following two company-specific boilerplates A and B:

    A: The complete system "<system name>" shall <description>.
    B: [ELSE] IF <condition>, THEN: [the function "<function name>" shall [not]] <description> [ELSE: <description>].

Boilerplate A is used to document functional requirements. B is a boilerplate for the documentation of conditional requirements. Hence, these two boilerplates determined which requirements were relevant for the automated conversion during our experiments. It is important to note that we did not distinguish between functional and non-functional requirements in our experiments. This means that if an identified non-functional requirement was converted into boilerplate A, we considered it accurate if it syntactically fit the boilerplate.

We used CRS 1 to manually analyze the sentence structure of the present requirements. Afterwards, these findings were used to define JAPE rules and to determine the settings of other NLP tools, which we implemented in the NL pre-processing component of the R2BC. Except for gazetteers, we applied the same tool settings for all three CRS. We applied CRS-specific gazetteers for the conversion of requirements into boilerplate A. The gazetteers were mainly used to identify the system name, i.e. the subject in the original requirement. The conversion of requirements into boilerplate B was performed without gazetteer support. Within preliminary experiments, the R2BC achieved sound results as presented in Table II.

TABLE II. RESULTS OF PRELIMINARY EXPERIMENTS

| | Boilerplate A: precision of requirements identification | Boilerplate A: conversion accuracy | Boilerplate B: precision of requirements identification | Boilerplate B: conversion accuracy | Conversion time | Pages |
| CRS 1 | 100% | 62.5% | 100% | 89.3% | 10 s | 88 |
| CRS 2 | 100% | 93.3% (1) | 100% | 67.6% (1) | 89 s | 567 |
| CRS 3 | 100% | 100% | 100% | 84.6% | 7 s | 60 |

(1) The calculations are based on results for the first 400 pages of CRS 2.

The conversion of the 88 pages of CRS 1 into boilerplates A and B took roughly 10 seconds. The R2BC achieved a precision of 100% in identifying requirements that are relevant for one of these two boilerplates. The conversion of identified requirements into boilerplate A worked with an accuracy of 62.5%. In the other 37.5% of conversion results, the system name was not converted completely into the boilerplate. In most of these cases, the subject of the original requirement consisted of several words, and the R2BC converted only part of the subject into the boilerplate. This was caused by the applied gazetteer, which contained several system names, some of which contained the same words. If, for example, the gazetteer contained among others the system names "controller" and "controller module", and "controller module" was mentioned in the original requirement, the gazetteer recognized "controller" as the system name and cut off "module".

Within the same CRS, the R2BC achieved a conversion accuracy of 89.3% for boilerplate B. The conversion accuracy was lowered by the diversity of wording that was used to express conditions. For instance, some authors used "In the event" instead of "If" or "When". Spelling errors also prevented a higher score; for instance, authors did not place a comma after the condition description in an if-clause. We have also observed that some requirements contained several conditions. As a reminder, the R2BC did not use a gazetteer for the conversion into boilerplate B and still generated better results than for boilerplate A. This leads us to the conclusion that the application of a gazetteer can also be a disadvantage. In addition, we calculated the recall score for CRS 1. The R2BC identified relevant requirements for boilerplate A with a recall of 100%. The recall score for boilerplate B, at 66.7%, was lowered by the same reasons as mentioned before.

We applied the same tool setup, except for the gazetteer, which we adapted accordingly, to the automated conversion of CRS 2. The conversion of this extensive document comprising 567 pages was accomplished within 89 seconds. For this CRS, too, the R2BC achieved a precision of 100% in identifying requirements that are relevant for boilerplate A and boilerplate B. The conversion into boilerplate A worked with an accuracy of 93.3%. For boilerplate B, the R2BC achieved a conversion accuracy of 67.6%. These calculations are based on results for the first 400 pages of CRS 2. Since CRS 2 is an extensive document, we took the first 400 pages as a large sample and refrained from evaluating the rest. Only 6.7% of the boilerplate A conversion results for CRS 2 had defects. These defects were again caused by the partial recognition of the subject of the original requirement, which consisted of several words. In contrast, the conversion accuracy for boilerplate B was 67.6%. As for CRS 1, the main reasons for conversion flaws are multiple conditions per requirement and missing commas.

The R2BC converted the 60 pages of CRS 3 into boilerplates within 7 seconds. The identification of requirements for the conversion into boilerplates A and B worked with a precision of 100% for both. The conversion of identified requirements into boilerplate A worked with an accuracy of 100%. For boilerplate B, the R2BC achieved a conversion accuracy of 84.6%. The conversion defects were caused by several conditions per requirement.

In summary, the R2BC prototype automatically analyzes large amounts of requirements text and recognizes relevant requirements for the conversion to predefined boilerplates. Subsequently, all relevant requirements are converted into boilerplates automatically by the R2BC. Preliminary experiments show sound results. Especially within the large CRS with 567 pages, the R2BC achieved a conversion accuracy of 93.3% for boilerplate A. This score makes the presented technology promising. Nevertheless, the recognition of sentence parts of the original requirements should be improved. We propose to adapt the JAPE rules to cope with multi-word subjects. Although our JAPE rules already target system names consisting of several words, evidence shows that the number of tokens to be taken into account for a system name should be increased. We also plan to elaborate the combination of JAPE rules and gazetteer lists for a proper named entity recognition.

The results presented so far were attained by the fully automated conversion. However, the R2BC methodology is designed as a semi-automatic process, i.e. the user is able to check the results. The conversion results that we considered incorrect in the above evaluation in most cases require just minor adjustments. We presented our methodology to industry experts. Their feedback and their suggestions for improvement are described in the following section.

B. Validation with Industry Experts

We presented our methodology for the R2BC and our proprietary prototype to requirements engineers. Besides the automated conversion functionality, we implemented a selection of the functionality depicted in Fig. 4 in our prototype. We also presented the future GUI of the R2BC to industry experts, as illustrated in Fig. 4. The general setup of this GUI was confirmed by the experts. Among other things, the experts recommended implementing a "Clarify with customer" button in the GUI. This button shall allow an unclear requirement to be stored in a separate list. This list of unclear requirements can be discussed with the OEM after the conversion. To make sure that the actual question regarding this kind of requirement does not get lost, the experts suggested adding a dialog box for taking notes, which should appear once the "Clarify with customer" button is clicked. This function shall make it possible to specify the unclear aspect of the requirement or to document a question. If many notes were taken, this would allow the requirements engineer to remember the questions when talking with the customer.

For further improvement, industry experts suggested implementing a functionality that allows the engineer to focus only on those conversion results that are likely to be defective. As shown by the conversion accuracy scores gained during preliminary experiments, most of the conversion results are correct and therefore do not need further adjustment. According to the suggestion of the experts, an algorithm that works in the background could calculate the probability of the correctness of the conversion results, which would result in a reliability measure. Hence, all conversion results above a certain threshold would be considered reliable and therefore would not need to be reviewed. Instead, only those conversion results that have a value below this threshold should be reviewed by the requirements engineer. This function would make it possible to work more efficiently with the R2BC.

In conclusion, industry experts assessed our proprietary R2BC prototype as useful and promising. Their feedback showed that usability and efficiency are key for a successful implementation of the R2BC. Our next version of the R2BC will implement the presented suggestions for im-
provement alongside with other features, to serve practitioners The prototype contains a screen for the original requirement best at their daily tasks. and three other screens for the conversion alternatives. A user can interact with the R2BC prototype by using the “Load V. R ELATED WORK document”, “Start conversion”, “Confirm”, “Skip” and “Export results” buttons. Also, it is possible to edit the suggested Arora et al. developed an approach for the automated conversion alternatives manually. This tool concept was as- checking of conformance of natural language requirements sessed by the expert group to be very useful. to boilerplates based on NLP techniques. They introduce a ASE 2019: 16th Workshop on Automotive Software Engineering @ SE19, Stuttgart, Germany 51 generalizable method for casting templates into NLP pattern usability. This will allow requirements engineers, who have matchers. For this purpose, they translate common templates no experience in NLP, to take advantage of this beneficiary into a BNF (Backus-Naur form) grammar. Afterwards, these technology. grammars are implemented as JAPE pattern matching rules for VI. S UMMARY AND OUTLOOK checking template conformance. According to Arora et al. the approach provides a robust and accurate basis for checking In this work, we presented the Requirements to Boilerplates conformance to templates [1]. Converter (R2BC), which is a prerequisite for an automated Farfelder et al. provide requirements engineers with predefi- delta analysis. The R2BC is a novel approach for a semi- ned boilerplates and a domain ontology for the documentation automated conversion of random natural language require- of high-quality requirements during elicitation. To start the ments into predefined boilerplates. To achieve this task, we documentation with DODT, the requirements engineer uses applied NLP and a proprietary developed converter. Alongside the GUI and chooses from a set of predefined boilerplates. 
the technology, we provided future users with a methodology. Subsequently DODT is accessing a domain ontology, which During preliminary experiments the R2BC prototype proces- contains all available words for the editable parts of the chosen sed large documents with up to 567 pages within seconds boilerplate. The requirements engineer selects the required and achieved high precision in requirements identification words from the list and defines by this the requirements. and conversion accuracy scores. The sound results prove the DODT is based on NLP techniques [6]. effectiveness of our approach. In addition to that, industry Schraps and Bosler present an approach to extract know- experts evaluated our proprietary developed R2BC prototype ledge from software requirements and to transfer it into a and the methodology as highly useful and promising. Our requirements ontology. They use NLP techniques to annotate future activities are focused on the improvement of the R2BC. requirements first. Second, a pattern recognition algorithm To this end, we will use the conclusions from preliminary is searching for predefined patterns within the grammar of experiments and the feedback from industry experts. Above the requirements. As a consequence, all parts of the requi- all, we will focus our effort on the development of a concept rements which fit into these patterns are transferred into and tool support for an automated delta analysis. the requirements ontology. By this approach Schraps and R EFERENCES Bosler are aiming at the elimination of inconsistencies between [1] C. Arora and M. Sabetzadeh and L. Briand and F. Zimmer. Automated specification and software models [12]. Checking of Conformance to Requirements Templates using Natural Fockel et al. describe a methodology for the documentation Language Processing. IEEE, 2015. [2] H. Cunningham and D. Maynard and K. Bontcheva and V. Tablan. GATE : of functional requirements with boilerplates. 
According to this A Framework and Graphical Development Environment for Robust NLP methodology an overall function is decomposed into its leaf Tools and Applications. In Proceedings of the 40th Annual Meeting of functions. The deployment of boilerplates together with this the Association for Computational Linguistics (ACL 2002), 2002. [3] H. Cunningham and V. Tablan and A. Roberts and K. Bontcheva. Getting methodology leads to a complete model of the requirements More Out of Biomedical Documents with GATE’s Full Lifecycle Open specification. To enable an efficient deployment of the boi- Source Text Analytics. PLOS Computational Biology, 9(2), 2013. lerplates and the methodology Fockel et al. developed a tool [4] H. Cunningham and et al. Developing Language Processing Components support, called ReqPat. ReqPat can be integrated in commer- with GATE Version 8 (User Guide). University of Sheffield, Department of CS, 2014. cial tools like IBM Rational DOORS. This tool does not only [5] C. Ebert and M. Jastram. ReqIF : Seamless Requirements Interchange support the user during the documentation of requirements, Format between Business Partners IEEE-Software, 29(5), 2012. it also tests the quality of the requirements automatically. [6] S. Farfeleder and T. Moser and A. Krall and T. Stalhane and I. Omoronyia and H. Zojer. Ontology-Driven Guidance for Requirements Elicitation. Moreover, ReqPat is able to transfer boilerplates compliant Springer, LNCS 6644, 2011. functional requirements into modeling tools (e.g. SysML/UML [7] M. Fockel and J. Holtmann and M. Meyer. Mit Satzmustern hochwer- tools) [7]. tige Anforderungsdokumente effizient erstellen. In OBJEKTspektrum, None of the presented approaches enables a semi-automated RE/2014, 2014. [8] ISO, IEC, and IEEE. ISO/IEC/IEEE 29148. Technical report, ISO IEEE conversion of random natural language requirements into IEC, 2011. predefined boilerplates. While [6] and [7] present methods and [9] C. Manning and H. Schütze. 
Foundations of statistical natural language tool support for the documentation of requirements with boi- processing. MIT press, 1999. [10] F. Ritter and A. Schul. Entwurf und Implementierung einer Werkzeugun- lerplates right from the beginning, our experience shows, that terstützung zur sprachlichen Analyse und automatisierten Transformation automotive suppliers receive requirements, which comply to von Projektlastenheften im Kontext der Automobilindustrie. Bachelor different styles of boilerplates or do not comply to boilerplates thesis, FH Dortmund, 2019. at all. Natural language requirements have to be converted [11] C. Rupp and SOPHIST-Gesellschaft für Innovatives Software- Engineering (Nürnberg). Requirements-Engineering und -Management, into boilerplates first, before one can benefit from their semi- Aus der Praxis von klassisch bis agil. Hanser, 2014. formal nature. The R2BC provides a flexible and efficient way [12] M. Schraps and A. Bosler. Knowledge Extraction from German Automo- to convert random requirements into predefined boilerplates. tive Software Requirements using NLP-Techniques and a Grammar-based Pattern Detection. In Proc. of the Int. Conf. on Pervasive Patterns and This is a preparatory stage for machine-readability. As a conse- Applications, 2016. quence, these requirements can be processed automatically in [13] K. Zichler and S. Helke. Ontologiebasierte Abhängigkeitsanalyse im further product development processes, e.g. in an automated Projektlastenheft. In Proceedings Automotive - Safety und Security (AUTOMOTIVE 2017), GI-LNI, 269, 2017. delta analysis. Moreover, the R2BC methodology aims at high ASE 2019: 16th Workshop on Automotive Software Engineering @ SE19, Stuttgart, Germany 52
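Illustrative sketch for the reliability measure discussed in Section IV-B. The industry experts suggested that conversion results above a certain reliability threshold need not be reviewed. The following minimal Python sketch shows one way such a triage could look for conditional requirements in the style of boilerplate B. It rests on several assumptions that are not part of the R2BC: the regular expression CONDITIONAL is a hypothetical stand-in for the JAPE rules, the slot names (condition, name, description) are chosen for illustration, the penalty values and the threshold of 0.8 are arbitrary, and the score is a simple heuristic rather than a calibrated probability. The two penalties mirror the defect causes reported in the experiments, namely multiple conditions per requirement and missing commas.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical surface pattern for a simple conditional requirement, e.g.
# "If <condition>, the function <name> shall [not] <description>."
# Illustrative stand-in only; the R2BC itself relies on JAPE rules.
CONDITIONAL = re.compile(
    r"^If\s+(?P<condition>.+?)(?P<comma>,)?\s+(?:then\s+)?the\s+function\s+"
    r"(?P<name>.+?)\s+shall\s+(?P<negation>not\s+)?(?P<description>.+?)\.?$",
    re.IGNORECASE,
)


@dataclass
class Conversion:
    original: str
    boilerplate: Optional[str]  # None if the sentence does not fit the pattern
    reliability: float          # heuristic score in [0, 1], not a probability


def convert(requirement: str) -> Conversion:
    """Convert one requirement into a boilerplate-B-style sentence and score it."""
    match = CONDITIONAL.match(requirement.strip())
    if match is None:
        return Conversion(requirement, None, 0.0)

    negation = "not " if match.group("negation") else ""
    boilerplate = (
        f'IF {match.group("condition")}, THEN: the function '
        f'"{match.group("name")}" shall {negation}{match.group("description")}.'
    )

    # Assumed penalties, motivated by the defect causes named in the evaluation.
    reliability = 1.0
    if match.group("comma") is None:  # missing comma after the condition
        reliability -= 0.4
    if re.search(r"\b(?:and|or)\b", match.group("condition"), re.IGNORECASE):
        reliability -= 0.4            # probably several conditions
    return Conversion(requirement, boilerplate, round(reliability, 2))


def needs_review(results: List[Conversion], threshold: float = 0.8) -> List[Conversion]:
    """Return only the conversion results whose score is below the threshold."""
    return [result for result in results if result.reliability < threshold]


if __name__ == "__main__":
    samples = [
        # Example requirement taken from the introduction of this paper.
        "If the combustion engine is running, the function ECU self-diagnosis shall be active.",
        # Hypothetical requirement with two conditions.
        "If the engine is running and the battery is charging, "
        "the function ECU self-diagnosis shall be active.",
    ]
    for result in needs_review([convert(sample) for sample in samples]):
        print(result.reliability, result.original)
```

With these assumed settings, the first sample is converted without penalty and is not shown to the engineer, whereas the second sample is flagged because its condition contains a conjunction. A production variant would derive the score from the NLP annotations instead of surface heuristics.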