S. Götz, L. Linsbauer, I. Schaefer, A. Wortmann (Eds.): Software Engineering 2021 Satellite Events, Lecture Notes in Informatics (LNI), Gesellschaft für Informatik, Bonn 2021

An Approach for Partially Automated Test Generation Based on Signal Recordings

Timur Eksen¹, Frank Thielecke²

Abstract: The increasing complexity of aircraft systems and the number of functions running on avionics components lead to a rapid growth in the number of required tests. In addition, the mission profile can change over the long life cycle of the aircraft, so that the originally developed tests may not cover all scenarios that occur during operational use. Therefore, new test strategies are needed that allow tests to be generated without the effort growing proportionally. At the same time, the aim is to identify and analyse previously unknown behavioural abnormalities. This paper describes an approach for the semi-automated generation of new test scripts based on recorded data from operations or from test and simulation activities. For this purpose, methods were developed to convert these data into a generic format, so that they can be managed and processed with uniform functions. The recordings can be filtered via a graphical user interface (GUI) to derive test stimuli and verdicts. For the generation of the test script, an automated process was developed that is based on the specification of the test description language Generic State Chart extensible Markup Language.

Keywords: system testing, test generation, data recordings, automation

1 Introduction

Aircraft and their systems are exposed to many different tasks during their life cycle. Even if the requirements are defined and implemented as precisely as possible during the development phase, it is still possible that the end user will use the aircraft for missions and purposes other than those intended.
A passenger aircraft becomes a cargo aircraft, and an aircraft originally planned as a medium-range aircraft is further developed for long-haul flights. Due to the changed use, there may be unknown effects in the operational behaviour that were not considered during development, which can date back several decades, and were therefore not tested.

¹ Hamburg University of Technology (TUHH), Institute of Aircraft Systems Engineering, Nesspriel 5, Hamburg, 21129, timur.eksen@tuhh.de
² Hamburg University of Technology (TUHH), Institute of Aircraft Systems Engineering, Nesspriel 5, Hamburg, 21129, frank.thielecke@tuhh.de
Copyright © 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0)

The challenge now is to recognise these effects and to feed them back into the verification process as effectively as possible, so that it can be ensured that the systems are suitable and safe to use in the new scenarios. These scenarios can be identified in various ways. The adapted customer profiles could be specified directly by the operator and shared with the original equipment manufacturer (OEM), although the operator would first have to know the original scenarios. Alternatively, the developer could accompany the operation of the aircraft. Both variants require additionally trained employees who would have to feed these effects back into the test process. An exact reconstruction therefore requires a high expenditure of personnel and time, which in turn drives up the costs that should be avoided.

In addition to the changing operational scenarios, the systems are becoming increasingly complex and, made possible by continuously increasing computational power, new functions are constantly being implemented on the avionic systems.
Alongside the pure increase in complexity of individual systems, there is also a growing interconnection of formerly self-sufficient components, which can give rise to many interactions. In order for the test process to withstand these developments, the degree of automation must be increased so that lengthy manual operations can be eliminated.

This paper presents a method for deriving tests from data recordings. It is intended to reflect the increasing digitalisation of systems and the growing standardisation of data recording in flight operations. First, the relevant basics for the following topics are described, followed by the methodological procedure and the implementation of this approach. The methodological procedure is further subdivided into the usable data sources and their management, the derivation of conditions for verdicts based on recordings using a graphical interface, and the process for converting the data into tests. Then the application of the method is described using a virtual test and the results of the implementation are discussed. Finally, the contents of this paper are summarised and an outlook on future work is given.

2 Fundamentals

For a better understanding of this paper, some fundamentals are introduced first. The test strategy at the Institute of Aircraft Systems Engineering (FST) is briefly described and another publication on this topic is referenced. Then the test description language is introduced into which the signal recordings are transferred at the end of the process. Finally, work related to this paper is presented.

2.1 Test Strategy

Due to the increasing complexity of systems and functions, test methods also need to be further developed so that the amount of time and personnel involved does not become a blocking element in the development process.
At the Institute of Aircraft Systems Engineering (FST) of Hamburg University of Technology (TUHH), two different approaches are pursued for this purpose, a model-based and a data-driven approach, each of which addresses the problem of test creation and test coverage from a different perspective. Since the research on the two approaches is still ongoing, this paper does not compare them or give a final evaluation. This will be done at a later stage, so only a brief overview is given here.

In [HT20], a model-based method for creating test stimuli for operational scenarios was presented. In this approach, a modular scenario model is built up gradually as a state chart, from which signals can be generated as test input. The modular approach allows individual environmental objects to be elaborated systematically and in detail without directly editing the overall model, thus reducing the complex environment definition to simpler tasks. The basic system model is kept very abstract and is only specified during the later process. This allows an early implementation of an initial model and early testing. This forward-looking approach aims to identify undesired but realistic scenarios in the development process so that they can be avoided prior to entry into service.

In contrast, the data-driven approach takes a retrospective view of scenarios. Although the method allows the use of recordings from simulation and test runs, the focus is on data from the operational handling of aircraft systems. It is assumed that, due to the high number of flights and the diversity of aircraft operators, a broad mission spectrum can be acquired, provided that the required data can be collected during operation. Since the recordings cover all real-world application scenarios that were actually performed if all flights are recorded, the data basis can be used to achieve very broad test coverage.
The challenge of this approach is to extract from the huge volume of data those data sets that contain relevant, not yet tested behaviour and that do not only represent the nominal case.

2.2 Generic State Chart extensible Markup Language

In the testing area, specialised description languages for test cases have been developed which support many test procedures, but this heterogeneity also prevents easy exchange between project partners and the adoption of existing procedures for other systems. In [Bo18], State Chart extensible Markup Language (SCXML) was selected as the description language for the exchange format based on various application scenarios from the field of aircraft testing. SCXML is an open standard that was developed by the World Wide Web Consortium (W3C) [SC15]. It is an event-based state-machine language for the representation of state diagrams based on Harel state charts. Among other things, the standard contains constructs for the description of parallel and nested state charts, which offers many possibilities for test description. In the research project Agile-VT, the existing standard was further developed with the aim of identifying and specifying missing constructs needed for the test process. This was done taking into account the intersection of functions provided by the participating test system manufacturers dSPACE, TechSAT and Vector. In [Fr19], the requirements for an intermediate format for the exchange between industrial partners and the required structure are considered.

2.3 Related Work

The use of data recordings for the verification and validation (V&V) process has been the subject of several papers in recent years. The focus of these works was not always on the test process, but in some cases also on comparing existing results with recordings from real-world application scenarios. The topic was pursued particularly strongly in the automotive area of Advanced Driver Assistance Systems (ADAS).
Even though the direct area of application is different, some of this work is presented here due to the similar objectives.

1) Using Data Recordings for the V&V Process:
The work in [La18] describes a method for recognising new scenarios in relation to the existing test set. For the creation of a reduced test set of input vectors, an autoencoder is used which is adapted to the system under test by an initial data set. Through further test runs, additional recordings could be generated, which were separated into sequences of predetermined length and then added to the autoencoder as further training data sets. The distance to the original test set is used to determine which sequences are added to the test set. To avoid repetition, only the sequences with the greatest distance are added, so that the range of scenarios that can be tested is as wide as possible.

Another approach to the applicability of real-world recordings in the field of Rapid Control Prototyping (RCP) is presented in [Ba15]. The focus of that work is on the handling of components that not only require a time-dependent input, but also have to react to certain events (e.g. recognition of road signs). For this purpose, the dependencies of the input signals are analysed so that they can be grouped into different signal types. In order to create a coherent overall stimulation, these are then coupled to the movement of the vehicle and used in X-in-the-loop simulations.

A slightly different approach is presented in [Ro16], where the real-world data is not used to stimulate a test system, but as a benchmark for autonomous driving functions in road traffic. Since these should be at least as safe as average human driving behaviour, this behaviour must be processed as a reference.

Compared to the aforementioned works, this paper aims to describe a more generic method to make data from different sources (test, simulation and operational use) reusable in the V&V process.
While taking into account the experience gained during testing with specific systems, it should still allow easy use with different systems and system levels. This is also supported by the use of generic SCXML, so that there is no direct dependence on just one test system.

2) Test Automation:
The challenges and opportunities of automated testing were highlighted in [BWK05] using several projects as examples. Although the projects do not come from the field of aircraft systems engineering, but from various software projects (e.g. sales support for insurance companies), many fundamental points can still provide important insights. In these projects, the tests were carried out more effectively when it was possible to test at different system levels (e.g. unit, integration) rather than being limited to certain areas or applications. This allowed test procedures to be integrated into the development earlier, and the parallel progress meant that the test cases could be better adapted to the actual needs. Furthermore, it became clear that the processes during automated testing must be captured and documented as completely as possible, as components and systems may have to be tested again after a new development step. The work described shows that increased automation of all areas of testing (test creation, configuration, execution and evaluation) can lead to a System Under Test (SUT) with fewer errors. However, the best results were achieved when a mixture of manual and automated processes was used, so that the test process is not executed in a black box with no oversight from the test engineer.

3 Methodical Approach for the Test Generation

The developed method for test generation is based on four essential steps, so that data recordings from different sources can be used for the testing process:

1. Pre-processing of the recordings into a generic structure, consolidation of metadata and creation of a database entry
2. Selection of the signals to be used to stimulate the System Under Test
3. Selection of the signals to be used as reference for the verdicts
4. Automatic generation of the test script

Figure 1: Schematic Structure of the Process

The second step is currently still a manual activity and mainly requires system knowledge and good knowledge of the existing recordings. As this is a rather trivial task apart from the acquisition of the relevant information, it is not described in detail in this paper. In the current implementation, the signals can be filtered using structured query language (SQL) functions and then added to the signal set for test generation (3.3) via a manual confirmation.

3.1 Data Processing and Management

In order to work efficiently with the recordings and to allow various functions to access the data, a relational SQL database was set up which serves as the basis for all further processes. Import functions make it possible to use different data sources as a data basis. These convert the recordings from their original format into a generic structure. The numerical values are stored in a matrix and the signal names in an additional cell array. This has the advantage that during further processing there is a clear assignment of the data types to the containers and further conversion steps are omitted. The connection between signal names and numerical data is ensured by the same position in the matrix and the cell array. All other implemented functions can thus have a uniform structure and can act independently of the data sources. So far, import functions have been created for recording formats of the test systems from dSPACE and TechSAT, for exported data from the Airbus database SkyWise and for data generated in MATLAB/Simulink.
By importing recordings from these data sources, three different directions are covered: data from test and operational runs as well as from simulations can be processed. The recording systems typically use a proprietary data format for the direct storage of the recordings that cannot be read by MATLAB. In order for the import function to read the data, they must be converted into a non-proprietary format in an intermediate step using proprietary tools. With the sources mentioned above, export as a comma-separated values (CSV) file has proven to be a good solution, as it is supported by all systems used so far. The structure of the information in the file still differs significantly between sources, but it can be easily processed by suitable adapter functions.

The read-in data is not stored directly in the database, but in one MATLAB workspace file per data recording. This has the advantage that no further conversion between the internal data formats of SQL and MATLAB is required when the recordings are loaded during the test generation process. By using the workspace files, MATLAB is able to access individual vectors in the file directly, which results in a considerable increase in performance, as only a small part of the data has to be loaded into working memory.

During the import process, metadata (e.g. minimum, maximum and mean value) is automatically collected, making it possible to compare the recordings. In contrast to the recordings, this metadata is stored directly in the database, as its data volume is much smaller than that of the recordings and does not significantly affect the processing performance. Furthermore, the SQL database search and filter functions can be used for the metadata. On the one hand, manual searches for specific data samples can be carried out quickly; on the other hand, the search and filter functions can also be easily automated.
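The interplay of generic structure, metadata and SQL filtering described in this section can be illustrated with a minimal sketch. It is not the MATLAB implementation of the paper: Python with an in-memory SQLite database is used purely for illustration, and the table layout, the column names, the file names and the `Recording` class are assumptions. The query selects all recordings in which a given signal exceeds a threshold and returns the most recent ones first.

```python
import sqlite3
from dataclasses import dataclass
from statistics import mean


@dataclass
class Recording:
    """Generic structure: names[i] labels column i of values."""
    path: str     # hypothetical storage path of the recording file
    date: str     # ISO date, so that string order equals date order
    names: list   # signal names
    values: list  # rows of samples, one column per signal


def signal_metadata(rec):
    """Yield (path, date, name, column, min, max, mean) for every signal."""
    for col, name in enumerate(rec.names):
        samples = [row[col] for row in rec.values]
        yield (rec.path, rec.date, name, col,
               min(samples), max(samples), mean(samples))


def build_index(recordings):
    """Store only the metadata in the database (schema is an assumption)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE signal_meta (path TEXT, rec_date TEXT, "
               "signal TEXT, col INTEGER, vmin REAL, vmax REAL, vmean REAL)")
    db.executemany("INSERT INTO signal_meta VALUES (?, ?, ?, ?, ?, ?, ?)",
                   [m for rec in recordings for m in signal_metadata(rec)])
    return db


def recordings_exceeding(db, signal, threshold):
    """Paths of recordings whose signal exceeds the threshold, newest first."""
    rows = db.execute(
        "SELECT path FROM signal_meta WHERE signal = ? AND vmax > ? "
        "ORDER BY rec_date DESC", (signal, threshold))
    return [row[0] for row in rows]


recordings = [
    Recording("rec_a.mat", "2020-03-01", ["pressure", "valve"],
              [[1.0, 0], [2.5, 1]]),
    Recording("rec_b.mat", "2020-06-15", ["pressure", "valve"],
              [[1.0, 0], [4.0, 1]]),
    Recording("rec_c.mat", "2020-05-10", ["pressure"], [[0.5], [0.8]]),
]
db = build_index(recordings)
print(recordings_exceeding(db, "pressure", 2.0))  # ['rec_b.mat', 'rec_a.mat']
```

Because the stored path and column index point back to the recording file, only the metadata has to pass through SQL, while the bulk data stays in the per-recording files.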
The search and filter criteria can be easily combined with each other in SQL, so that from a large selection of data recordings only those needed for the test stimulus or the verdicts remain. For example, all data recordings that contain a certain signal which exceeds a predefined threshold could be selected. Afterwards, the result of the filtering could be sorted by date, so that the most recent entries are displayed first. The metadata contains the storage paths to the individual workspaces and the exact position of the stored signals, so that the actual recording can be accessed via this information. The database thus acts as the managing interface between the data and all the functionalities implemented in MATLAB. The schematic overview of the developed structure is shown in Figure 1.

3.2 Graphical Selection of the System States to be checked

The main task of a test is to verify the correct behaviour of the SUT. In addition to stimulating the system, criteria are needed to determine whether the system behaves as expected. These criteria can often be taken from requirements, for example redundancy conditions. In the data-based approach presented here, recorded signals are used as the basis for the verdicts. These signal courses contain the knowledge about normal or abnormal behaviour of the system used. The recorded signals can be those from previous tests or simulations. Verdicts can therefore be generated even if the SUT is used for the first time and no further information exists. The signals in the database are not categorised as input or output signals, so that no additional restriction is imposed on the selection process. The classification as input or output signal is currently only recognisable from the signal name, if a predefined naming scheme was used during recording. For a meaningful signal selection, expertise on the system is needed so that the test condition can generate further insight into the system behaviour.
A pre-selection must be made from all signals available in the database, so that the handling is simplified. Then it must be determined how many verdicts are needed and which signals from the pre-selection are intended for which verdict. The assigned signals can then be plotted for each verdict. In the plot, the timestamps from which the parameters for the verdict are to be derived can be selected directly in the GUI. For this purpose, a vector is formed from the values of all selected signals recorded at that time. Figure 2 shows possible signal courses and an exemplary selection of state vectors for the verdicts. The broken lines (v1-v6) mark the selected timestamps at which the values of the signals for the verdict generation are determined. The signals could be, for example, the status (active/passive) of loads in the cabin. The graphic representation gives a good overview of the signal course and allows a precise and easy selection of the state vectors. Finally, these can be adapted to the needs of the test. If several signals are used for a verdict, they can be linked together by logical conditions (Table 1). In addition, an upper and a lower tolerance can be defined for each value individually and independently of each other.

As not all required signal states may exist in the available recordings, it is also possible to parametrise additional verdicts purely manually. When the verdicts are selected, entries for them are created in the GUI with the corresponding parameters. These parameters can also be changed, or entries can be created without parameters in order to define them completely manually. During the test execution, the verdict signals are checked at the selected timestamps. Depending on the selection of the parameters, these must lie within a tolerance range or correspond exactly to the specified value.
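As an illustration of such a verdict check, the following minimal sketch evaluates one verdict from the values measured at a selected timestamp. It is written in Python for illustration only and is not the implementation of the paper; the function names and the example signals are hypothetical. The logical links follow Table 1, where OR is read as "at least one signal within tolerance" and XOR is restricted to exactly two signals, both of which are assumptions about the intended semantics.

```python
def within_tolerance(value, expected, lower=0.0, upper=0.0):
    """True if expected - lower <= value <= expected + upper."""
    return expected - lower <= value <= expected + upper


def evaluate_verdict(measured, conditions, logic="AND"):
    """Combine the per-signal results with one logical link.

    measured:   {signal name: value read at the verdict timestamp}
    conditions: {signal name: (expected, lower tolerance, upper tolerance)}
    """
    results = [within_tolerance(measured[name], *cond)
               for name, cond in conditions.items()]
    if logic == "AND":
        return all(results)
    if logic == "NAND":
        return not all(results)
    if logic == "OR":
        return any(results)
    if logic == "NOR":
        return not any(results)
    if logic == "XOR":
        if len(results) != 2:
            raise ValueError("XOR is defined here for exactly two signals")
        return results[0] != results[1]
    raise ValueError(f"unknown logic: {logic}")


# Hypothetical state vector: a cabin load status and a temperature.
conditions = {"load_1": (1, 0, 0), "cabin_temp": (21.0, 0.5, 1.0)}
measured = {"load_1": 1, "cabin_temp": 21.8}
print(evaluate_verdict(measured, conditions, "AND"))  # True
print(evaluate_verdict(measured, conditions, "NOR"))  # False
```

Setting both tolerances to zero gives the case in which the signal must correspond exactly to the specified value.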
From the individual results, i.e. whether each signal corresponds to the expectations or not, the overall result for the respective verdict is obtained by using the logical connections. A check of the signal curve over a certain period is currently only possible with the use of several verdicts arranged consecutively.

Figure 2: Exemplary Selection of Signal States for Verdict Generation

Logic            Description
AND              All signals must be within the defined tolerance range.
NAND (Not AND)   At least one of the signals must be outside the defined tolerance range.
OR               One of these signals must be within the defined tolerance range.
NOR (Not OR)     All signals must be outside the defined tolerance range.
XOR (Exclusive OR)  If two signals are selected, one must be within the defined tolerance range and the other must be outside.

Table 1: Logical Conditions for Linking Verdict Conditions

Figure 3 shows a screenshot of the GUI for selecting the test stimuli and the verdicts. At the top left, all signals that have been preselected as stimulus or verdict reference are listed in tabular form. The plots at the bottom left represent these graphically, whereby the upper plot contains all signals for the stimuli and the lower plot contains the signals for the verdicts. For the verdicts, only the signals that were assigned to the specific test step are shown. In the overview to the right of the plots, the desired test step can be selected, and values taken over from the graphic view can be edited manually there.

Figure 3: GUI for selecting the test stimuli and the verdicts

3.3 Automated Generation of Test Scripts

A test script can be automatically generated from the selected stimulation signals and verdict conditions. In an intermediate step, the data is structured in a way that fits the test description language specification.
Since the specific structure of the selected test description language is only taken into account during the export, it is possible to implement export functions for other languages without having to modify other sections of the method. In this case, the export was implemented for generic SCXML, which was further developed in the project.

To enable the stimulus signals to be converted with the commands available in generic SCXML, they are classified first. The goal is to decide whether the signals are mapped identically to the original in SCXML or whether an approximation is useful. For this purpose, it is determined for each signal how often the signal value changes in comparison to the previous data point (e.g. two value changes for a rectangular function). If the rate of change is low in relation to the length of the signal vector, the signal can be easily mapped using set commands. For signal sequences with many value changes (e.g. sine waves), this method leads to extremely long scripts due to the underlying SCXML syntax. The size of the script also increases the effort to translate and compile the script for the different test systems and at the same time reduces the auditability of the test script. The longer the script becomes, the more difficult it is for the test engineer to detect and check deviations during the test execution.

To eliminate this problem, an option for approximating the test stimulation has been integrated. One objective in selecting the approximation method was that it should be able to use commands from the generic SCXML specification. This was intended to prevent the representation of the approximation from increasing the size of the SCXML script. The implemented algorithm is a piecewise linear approximation (PLA) based on the sliding window method [Ke01]. Starting from the first data point, a linear substitute segment is set up, which first goes to the following data point.
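The classification step and the sliding-window idea can be sketched as follows. This is a minimal Python illustration, not the implementation of the paper: the change-ratio threshold of 0.1, the function names and the uniform sample time `dt` are assumptions. The local error is computed as the root-mean-square error between a straight segment and the original samples, a segment is extended as long as that error stays within the limit, and each segment is expressed as a ramp given by slope, start value and execution time.

```python
import math


def count_value_changes(signal):
    """Number of samples that differ from their predecessor."""
    return sum(1 for prev, cur in zip(signal, signal[1:]) if cur != prev)


def use_set_commands(signal, max_change_ratio=0.1):
    """Prefer set commands if value changes are rare (threshold assumed)."""
    return count_value_changes(signal) / len(signal) <= max_change_ratio


def segment_rmse(signal, start, end):
    """RMSE between the samples and the straight line from start to end."""
    slope = (signal[end] - signal[start]) / (end - start)
    squared = [(signal[start] + slope * (i - start) - signal[i]) ** 2
               for i in range(start, end + 1)]
    return math.sqrt(sum(squared) / len(squared))


def sliding_window_pla(signal, max_local_error):
    """Return the segments as (start index, end index) pairs."""
    segments = []
    start, last = 0, len(signal) - 1
    while start < last:
        end = start + 1  # the segment first goes to the following point
        while (end < last and
               segment_rmse(signal, start, end + 1) <= max_local_error):
            end += 1     # extend while the local error is acceptable
        segments.append((start, end))
        start = end      # limit exceeded: a new segment begins here
    return segments


def to_ramps(signal, segments, dt):
    """Express each segment as (slope, start value, execution time)."""
    return [((signal[e] - signal[s]) / ((e - s) * dt), signal[s], (e - s) * dt)
            for s, e in segments]


rectangle = [0] * 10 + [1] * 10 + [0] * 10
triangle = [0, 1, 2, 3, 4, 3, 2, 1, 0]
print(count_value_changes(rectangle))   # 2
print(use_set_commands(rectangle))      # True
segments = sliding_window_pla(triangle, 0.01)
print(segments)                         # [(0, 4), (4, 8)]
print(to_ramps(triangle, segments, 1.0))
```

A larger `max_local_error` yields fewer and longer segments, which is the lever for trading script size against fidelity.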
For this segment, the local error is determined and, if it is smaller than the maximum specified error, the segment is extended gradually. This is done until the threshold is exceeded, which also marks the beginning of a new segment. The root-mean-square error (RMSE) is used to calculate the difference between the original signal and the approximated segment. These segments are parametrised in the test script as generic SCXML ramp functions.

For the test generation, the algorithm was implemented in such a way that the tester can influence it significantly through three parameters (Table 2). If one parameter is selected as the main criterion, the other two parameters are variable, so that the approximation can achieve the main goal. For example, if a maximum number of states is given, an approximation is created starting with an initial value for the deviations. After the first run, the local error is adjusted by a certain percentage, depending on whether the maximum value was exceeded or not. If the maximum number of states has been exceeded, then a larger deviation is allowed, so that fewer ramps are used. If, in contrast, the number of states stays below the maximum, the local error is reduced until either the maximum value would be reached on the next pass or there are no more significant changes in the number of states between two passes.

Parameter      Description
State Limit    Maximum number of states the test may have. Corresponds to the maximum number of ramps or segments into which the stimulation signal is broken down.
Local Error    Maximum deviation between the ramp as approximation and the original segment.
Global Error   Maximum deviation between the complete approximation obtained from all ramps and the original signal.

Table 2: Parameters of the Approximation Algorithm

After the representation as a set or as a ramp command and the corresponding parameters have been determined for all signals, the required states are consolidated. Since the commands for each signal were determined individually, it can happen that several command calls have to take place simultaneously. These are bundled in a common state so that no additional parallel path is created (Figure 4). On the one hand, this simplifies the implementation of the generation process; on the other hand, the test procedure is easier for the user to understand. This is especially important for automatically generated test scripts in order to increase their acceptance.

Figure 4: Summary of Stimulation Commands into Common States

The template according to which the test script is generated is shown in Figure 5. The states startup and teardown are used to set the system to the appropriate initial state or to reset it after the test is finished. With the data-based approach, the first state vector is taken from the recording. It is assumed that the recording does not start directly with the abnormal behaviour, but that the nominal case is still present at the beginning. Since this is not always true, the activation of these two states is optional. The actual stimulation of the test system takes place in the compound state, which is called logic. Based on the conversion of the selected signals, either one-to-one or as an approximation, the required commands are converted into SCXML states. The timing is implemented by the use of delayed events, so that the transition to the next state only takes place after the delay has elapsed. The verdicts are integrated into the timing of the test script in the same way as the stimulus commands. Table 3 shows an exemplary process description of three signals that are to be converted into an SCXML script.
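The consolidation and the timing via delayed events can be sketched as follows, using the values of Table 3 as input. This is a simplified Python illustration and not the generator of the paper: it emits a flat sequence of states instead of the compound state logic of the template, it omits teardown and verdicts, and it writes the stimulation commands as XML comments because the command syntax of generic SCXML is not given in this text. Only the standard SCXML elements `state`, `onentry`, `send` with a `delay` attribute and `transition` are used; the event name `next` is an assumption.

```python
import xml.etree.ElementTree as ET
from itertools import groupby

# (time in s, signal, value or [slope, start value, duration], command)
COMMANDS = [
    (0.0, "Signal A", 0, "set"), (0.0, "Signal B", 5, "set"),
    (0.0, "Signal C", -2, "set"),
    (0.5, "Signal A", 10, "set"),
    (1.0, "Signal B", [1, 5, 10], "ramp"),
    (3.0, "Signal A", 0, "set"), (3.0, "Signal C", [-2, -2, 15], "ramp"),
]


def consolidate(commands):
    """Bundle commands with the same timestamp into one common state."""
    ordered = sorted(commands, key=lambda c: c[0])
    groups = [(t, list(g)) for t, g in groupby(ordered, key=lambda c: c[0])]
    states, logic_index = [], 0
    for i, (t, cmds) in enumerate(groups):
        if t == 0:
            name = "startup"  # initial values of the test run
        else:
            name = f"logic_c0_c{logic_index}"
            logic_index += 1
        # Delay until the next state is due; None for the last state.
        wait = groups[i + 1][0] - t if i + 1 < len(groups) else None
        states.append({"name": name, "commands": cmds, "wait_s": wait})
    return states


def to_scxml(states):
    """Render the states; the timing uses standard SCXML delayed events."""
    root = ET.Element("scxml", {"xmlns": "http://www.w3.org/2005/07/scxml",
                                "version": "1.0",
                                "initial": states[0]["name"]})
    for i, state in enumerate(states):
        node = ET.SubElement(root, "state", {"id": state["name"]})
        entry = ET.SubElement(node, "onentry")
        for _, signal, value, command in state["commands"]:
            # Placeholder only: the real command elements are not shown here.
            entry.append(ET.Comment(f" {command} {signal} {value} "))
        if state["wait_s"] is not None:
            ET.SubElement(entry, "send", {"event": "next",
                                          "delay": f"{state['wait_s']}s"})
            ET.SubElement(node, "transition",
                          {"event": "next", "target": states[i + 1]["name"]})
    return ET.tostring(root, encoding="unicode")


states = consolidate(COMMANDS)
print([s["name"] for s in states])
print(to_scxml(states))
```

The two commands at 3 s end up in the same state, so no parallel path is needed for simultaneous signal changes.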
The signal values at time zero are used by default as the initial values of the test run and are set in the startup state. The subsequent signal changes are written to the logic state of the test script. In this example with three points in time, the states 'logic_c0_c0', 'logic_c0_c1' and 'logic_c0_c2' are created for this purpose, which are named according to the naming scheme defined in the Agile-VT project. Within these states, the commands (set and ramp) for the individual signals are integrated. After 0.5 s, signal A is set to the value 10 and a wait command delays the transition to the next state. At time 1 s, signal B is stimulated by a ramp function, whereby the values in the vector specify the slope, the start value and the execution time of the ramp. In this case, the ramp starts with the value 5 and then runs for 10 s with a slope of 1 until the value 15 is reached. The last table entry shows the case where two signals are changed at the same time.

Time [s]   Signal     Value          Command
0          Signal A   0              Set
           Signal B   5              Set
           Signal C   -2             Set
0.5        Signal A   10             Set
1          Signal B   [1, 5, 10]     Ramp
3          Signal A   0              Set
           Signal C   [-2, -2, 15]   Ramp

Table 3: Exemplary Process Description for the Test Case Generation

Figure 5: Template for Generation of Test Scripts in Generic SCXML

4 Verification of the Method

The developed approach for test generation based on recordings was tested on a virtual oxygen test system of the A350. This system offered the possibility to represent real processes in real time and to include manual interactions during runtime. This makes it easy to create different procedures to test the method, and the manual interface makes it possible to quickly change the system state in order to see whether the conditions detect a system malfunction.
For the verification of the method, a logical operational sequence was created together with the project partner Airbus as a use case in a simulation environment. The simulation execution and the recording of the data were performed using tools of the test system manufacturer TechSAT. During the recording, the sequence of actions was performed manually via a control GUI and consisted of the following steps:

1. Activation of the oxygen supply to the captain
2. Activation of the oxygen supply for the first officer and the third crew member
3. After a phase of normal oxygen consumption, the pressure in the active cylinder is reduced to such an extent that the switchover to the reserve cylinder is forced

Using the presented approach, signals from the recording were selected for stimulation so that the test steps described above, originally executed manually, could be reproduced by the test script. From other recorded signals, the verdicts were derived by which the success of the executed sequence could be determined. The generated generic SCXML script was then executed on the test system. The correct execution was verified by several measures. During the running test, the functional flow was monitored by the verdicts and the visible behaviour in the GUI. In addition, a new recording of the test was made, which was compared to the original recording and which allowed the correct timing to be checked.

Based on this use case, it could be shown that the presented method for data-based test generation is successfully applicable. The partly automated derivation based on existing recordings has significantly accelerated the generation process, and it can be carried out by users without knowledge of the test description language. Therefore, existing sequences from previous activities (test, simulation or operational activities) can be easily returned to the test operation.
During the development and testing of the method, limitations became apparent that must be taken into account when selecting recordings and signals. Depending on the structure and configuration of the test bench, not all recorded signals are suitable for stimulating the system. When selecting the signals, care must be taken to ensure that only input signals of the associated systems are used, in order to avoid illogical stimulation. Such stimulation can occur if a signal carries the same name throughout the simulation and it is not possible to see at which point of the system or subsystem the signal value was recorded. This problem can be avoided by a clearly defined naming convention, so that it is always traceable to which system a signal belongs and whether it is an input or an output signal.

Furthermore, signals may be actively set by other simulations. Although the test script stimulates these signals, they are immediately overwritten, so that the desired behaviour cannot be achieved. It is therefore important to make sure that such signals are not selected for stimulation. This can be supported by adding metadata about the signals or by relying on the engineer's existing system knowledge.

It has also been shown that, depending on the executing system and its settings, some system processes are sampled periodically. Among other things, this can mean that the actuation of a button in the GUI is not recognised. This GUI can be used to activate errors or to change parameters and system states of the oxygen system. The actuation sequence (0>1>0) sets the value to one for only a few milliseconds. This corresponds to the real recorded process, in which the actuation coincides with the cyclical query of the value in the GUI.
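The effect of a short actuation sequence (0>1>0) on a periodically sampled value can be illustrated with a small Python sketch. It models the query as sampling instants spaced one cycle apart and checks whether at least one instant falls inside the interval in which the value is one. The function name, the use of integer milliseconds and the example numbers (a 100 ms cycle and a 5 ms pulse) are assumptions for illustration and are not taken from the test system.

```python
def pulse_is_sampled(start_ms, width_ms, period_ms, phase_ms=0):
    """Return True if a periodic query sees the value one at least once.

    The value is one in the half-open interval [start_ms, start_ms + width_ms).
    Queries happen at phase_ms + k * period_ms for integer k.
    """
    # Smallest k with phase_ms + k * period_ms >= start_ms (integer ceiling division).
    k = -((phase_ms - start_ms) // period_ms)
    first_query_ms = phase_ms + k * period_ms
    return first_query_ms < start_ms + width_ms


if __name__ == "__main__":
    period_ms = 100  # assumed query cycle
    # A 5 ms pulse starting shortly after a query is missed ...
    print(pulse_is_sampled(start_ms=503, width_ms=5, period_ms=period_ms))
    # ... whereas a pulse that lasts one full cycle is seen.
    print(pulse_is_sampled(start_ms=503, width_ms=period_ms, period_ms=period_ms))
```

In this model, a pulse that lasts at least one full cycle is seen for every start time, whereas a shorter pulse is only seen if it happens to overlap a query instant.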
To ensure correct execution, either the GUI interaction must last longer than the cycle or the actuation sequence must be automatically extended during test generation. For both variants, specific system knowledge is required to identify the problem and to judge whether the adjustments distort the desired test procedure.

5 Summary and Conclusion

This paper presented an approach for the semi-automated generation of test scripts based on recordings of previous simulations, tests or operational activities. The presented method shows how data processing and management can be structured so that different data sources can serve as the basis for the test generation process. A software framework based on MATLAB and an SQL database was developed, which allows generic functions to be executed without direct dependencies on the original storage format. Afterwards, the graphical approach for the selection of verdict conditions, its implementation as a MATLAB GUI and the settable parameters were presented.

Furthermore, the automated generation process was explained, which converts recordings into test scripts based on the selected stimulus signals. It also became clear that, depending on the test description language, further preparation of the data can be useful. For this purpose, the signals were categorised, and the highly fluctuating signals are converted into stimulation commands via an approximation.

The automated generation process has demonstrated that new signal recordings can be adapted quickly for test procedures, as the time-consuming manual formulation of the sequences is no longer necessary. This process also ensured that all test scripts are generated without errors and that they conform to the specification of the test description language. As long as the test environment (e.g. test bench, configuration) and the test script fit together, operability can be assumed with a high degree of certainty.
It has also been shown that some knowledge about the involved systems is still required in order to carry out the test and achieve meaningful results. On the one hand, this concerns the correct selection of stimulation signals so that they are compatible with the test system; on the other hand, stimuli must be selected that produce a relevant scenario.

6 Outlook

This paper has presented the basic procedure for deriving test cases based on data recordings, but many of the described steps still rely on manual activities of the test engineer. These steps are to be developed further in future activities so that they can be automated, or so that the test engineer can be supported by further information.

In particular, the selection of the data sets to be used for test generation currently requires precise knowledge of the recording processes in order to generate a meaningful stimulation. Since operational data is also used for this process, it can be assumed that the required knowledge will not always be available due to the large number of different recordings. It is therefore essential to integrate analysis algorithms into the selection process that create a pre-selection of the available data sets. Assuming that the nominal case has already been tested extensively during development, it will then be the task of the analysis procedures to identify the recordings in which abnormal behaviour has occurred, so that this behaviour can be tested more precisely afterwards. For this purpose, different pattern recognition algorithms and approaches will be investigated with which nominal and abnormal behaviour patterns can be marked in the signal data. Afterwards, it is planned that all newly imported recordings will be examined automatically based on the known patterns, so that conspicuous data sets can be highlighted for test generation.
During development so far, it has become clear that test engineers only accept automated process steps if the process is traceable and if it is possible to intervene in it when necessary. It is therefore planned to automatically create documentation of the process for each test generation. This documentation should record all settings, regardless of whether they were set automatically or manually by the tester. For this purpose, intervention points must be created in the generation process so that the settings can be checked and, if necessary, adjusted at relevant points (e.g. when determining the maximum deviation of the approximation). During the implementation of the automatic documentation, it will also be examined to what extent it can fulfil requirements for the certification process.

7 Acknowledgement

This paper is based on research work done in the course of the Agile-VT project (contract code: 20X1730J), which is funded by the Federal Ministry of Economic Affairs and Energy in the national LuFo V-3 program.