An approach for eliciting requirements from digital sources in organisations using the Scrum method

Stylianos Georgiadis1,2, Jelena Zdravkovic2 and Janis Stirna2
1 European Dynamics, 15125 Maroussi, Athens, Greece
2 Stockholm University, Kista, SE-16 407, Sweden

Abstract
The business world is nowadays characterized by complexity due to rapidly evolving market and customer requirements. As a consequence, software providers face the challenge of delivering products at a higher pace and with more innovation. The agile methodology has a big impact on how software systems are developed: it should deliver business value in short iterations. Requirements are the basis of all software systems, and consequently Requirements Engineering (RE) plays one of the most important roles in system development. Traditional elicitation techniques relying on stakeholders’ requests do not cover the increasing demand for considering unintended data from organisations' digital sources, internal (e.g., transaction logs, sensors) or external (e.g., microblogs), which amplifies the need for the elicitation of data-driven requirements. This study proposes a process that combines data-driven and traditional RE approaches for Agile software development, and specifically for the Scrum method. The process intends to assist Agile professionals in eliciting requirements from digital sources in combination with intended data derived from the stakeholders, without impacting the main Agile practices. The motivation for the research originates from case studies carried out in a few companies facing the challenge of including data-driven requirements in their Agile approaches. The usage of the proposal is illustrated on an enterprise software case, and several Scrum professionals were interviewed to evaluate its correctness and importance.

Keywords
Enterprise Agile Frameworks, Data-Driven Requirements Engineering, Scrum, User Stories

Companion Proceedings of the 16th IFIP WG 8.1 Working Conference on the Practice of Enterprise Modeling and the 13th Enterprise Design and Engineering Working Conference, November 28 – December 1, 2023, Vienna, Austria
stylianos.georgiadis@eurodyn.com (S. Georgiadis); jelenaz@dsv.su.se (J. Zdravkovic); js@dsv.su.se (J. Stirna)
0000-0002-0870-0330 (J. Zdravkovic), 0000-0002-3669-832X (J. Stirna)
© 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (ceur-ws.org), ISSN 1613-0073

1. Introduction
Today’s dynamic business environment requires flexibility for organizations to endure and evolve. The agile methodology is increasingly considered for enterprise system development to satisfy dynamic and demanding customers’ needs and thus remain competitive in the market [1]. Agile methods, such as the widespread Scrum [2, 3], are highly iterative and incremental, and the development team works in close collaboration with the customer [4, 5]. Agile methods argue that system requirements evolve so rapidly that the focus must be set on the implementation as soon as a change is requested.
Pohl argued that requirements elicitation is the main activity of RE, where its first sub-activity concerns identifying relevant sources for eliciting requirements within a system’s context [6]. If the relevant sources are not identified properly, the requirements specification for the system becomes incomplete. In the traditional elicitation approaches, requirements are derived from human stakeholders as the main source. In agile methods like Scrum, the information related to the system to be developed is collected during interviews between the agile team and the system’s stakeholders, and then requirements, i.e., user stories, are created [3].
Requests for system changes are increasing sharply due to the vast emergence of digital data sources, as well as the ability of users to give online feedback within hours or even minutes. Owing to this phenomenon, the interest in considering digital data as new sources for requirements acquisition, in addition to the traditional stakeholder-driven ones, has significantly increased. Data-Driven Requirements Engineering (DDRE) has become an emerging sub-discipline where the requirements are gathered from vast digital sources such as microblogs, reviews, electronic documents, or sensor readings and computer logs, at the same time enabling increased automation in requirements elicitation [7, 8, 9].
Dynamic data offers new opportunities to leverage a broader range of user requirements. There is hence an interest in development methods for managing big and rapidly growing volumes of data that contribute to a continuous evolution of enterprise software systems. Focusing on data from digital sources enables the elicitation of up-to-date user requirements, which in turn improves customer satisfaction. Even though the inclusion of massive amounts of digital data poses challenges for all the actors involved in a software development project, it also strengthens the interaction between them: analysts, developers, stakeholders, and especially end-users.
The data from digital sources is most often not intended for the purpose of requirements elicitation; therefore, it lacks structure and completeness. A number of studies have addressed this issue [9] by proposing methods to link heterogeneous digital data to some initial requirements artefacts. However, there is still a lack of knowledge of how these efforts can become a part of, and assist, the agile methods in use and their practitioners, especially since the actors related to agile methods are not able to interact with the authors of the proposed requirements, nor can they control the pace of the incoming digital data.
Because the Scrum method has become widespread in agile development and, in addition, the first author of this study is a Scrum practitioner, the problem addressed is the lack of an effective process for managing data-driven requirements in a Scrum environment. The purpose of this study has been to contribute to the above-described challenges and the problem; therefore, the following research goal has been formulated: How can data-driven requirements engineering be used in a development project following the Scrum agile method in parallel with stakeholder-driven requirements techniques?
Using Design Science Research [10], we have aimed to design an applicable process for combining the management of data-driven and stakeholder requirements in a Scrum agile environment. The suggested solution intends to assist Agile professionals in eliciting requirements from digital sources in combination with intended data derived from human stakeholders, without impacting the main Agile practices; i.e., it should be an aid to the roles of the Scrum Master, the Product Owner, and the Development Team in acquiring user stories from digital sources and transferring them to development, while cooperating with existing stakeholders and processing the data with a higher degree of automation. The proposal is evaluated by several agile experts and demonstrated on a real-life case, but it has not yet been fully implemented in an agile project.
The remainder of the paper is structured as follows: in section 2, we present the background for this study and related works. Section 3 presents our main artifact, an integrated Scrum process for dealing with both stakeholder and data-driven requirements. In section 4 we present a summary of the artifact’s evaluation. Section 5 demonstrates the artifact on a real business case. A discussion of the results, concluding remarks, and future work are presented in section 6.

2. Background and related work

2.1. Scrum method
The Scrum method is based on the pillars of transparency, inspection, and adaptation [3]. Throughout an emergent development process, the work and included tasks should be visible to everyone related to the project. Inspection is essential to detect deviations or undesirable variances, while the product must always be adjusted to minimize any deviations and maximize the product’s value. Scrum structures the development into work cycles (i.e., iterations) known as sprints. A sprint is a dedicated period of time (2-4 weeks) in which a set amount of work is to be completed, with strict start and end dates.
The method recognizes four main roles participating in the system development: Stakeholder, Product Owner, Scrum Master, and the Development Team [11]. The Stakeholder collaborates closely with the other roles throughout the entire project, starting from providing requirements for the system. The Product Owner owns the system under development and is responsible for making all the decisions affecting its progress, including the creation of the product (system) roadmap, identification of requirements, development, incremental improvements, and maintenance. The Scrum Master is responsible for upholding the Scrum method, i.e., ensuring that the Scrum process is correctly and efficiently followed, that its tasks are feasible, that the project participants are focused on the project goal, and that potential obstacles in the team and the process are resolved. The Development Team is responsible for transforming obtained requirements into a working product (system).
In Scrum, there are three major artefacts: the product backlog, the sprint backlog, and the product increment. The first contains all the requirements that need to be implemented by the Development Team.
The Product Owner is primarily responsible for the product backlog and, in collaboration with the rest of the Scrum team, makes sure that the correct requirements (often referred to as items) are developed and implemented [11]. The sprint backlog collects the items from the product backlog that are forecasted to be completed in an ongoing sprint. It is planned by the Scrum Master, for the Development Team, to achieve an actionable plan for delivering the product increment [3], i.e., the outcome which can be delivered to the customer without any additional work.
In Scrum, the requirements are called user stories. According to [1], a user story describes functionality that provides value to either an end-user or a system’s owner and consists of three parts: a card, containing the brief story description created by the Stakeholder (i.e., the main requirement statement); a conversation, containing the details discussed throughout the process between the Stakeholder and the Development Team; and a confirmation, containing the conditions to be tested by the team in order to verify that the user story is developed as expected. Every user story card follows the template ‘As a <role>, I want <functionality>, [so that <benefit>]’, and it is the most important requirements elicitation and representation practice in the Agile software development process [12]. All elicited user stories are grouped in the product backlog and then, through the sprint backlog, assigned to the development team for a next sprint based on the priorities of the project.
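As an illustration of the card, conversation, and confirmation parts described above, a user story could be held in a small data structure. The following is a minimal Python sketch under stated assumptions: the class name `UserStory`, its field names, and the `card()` method are hypothetical and are not taken from Scrum or from the cited works, and the example values are purely illustrative.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class UserStory:
    """Hypothetical container for the three parts of a user story."""
    role: str                      # who wants the functionality
    functionality: str             # what is wanted
    benefit: Optional[str] = None  # optional "so that" part of the card
    conversation: List[str] = field(default_factory=list)  # details discussed
    confirmation: List[str] = field(default_factory=list)  # conditions to test

    def card(self) -> str:
        """Render the card text; the benefit clause is added only if present."""
        text = f"As a {self.role}, I want {self.functionality}"
        if self.benefit:
            text += f", so that {self.benefit}"
        return text


# Illustrative values only.
story = UserStory(
    role="registered user",
    functionality="to delete my application",
    benefit="I can submit another application",
)
print(story.card())
```

Keeping the benefit optional mirrors the bracketed, optional last part of the card template.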
The development process in Scrum steers a number of relevant tasks (aka actions). The major ones are as follows. The process starts when a Stakeholder decides to create or upgrade a system by creating a user story. The next actions are, in collaboration with the Scrum team, to review the user story for completeness, rank its priority, illustrate it (if needed), split it into smaller user stories if it is too big, and finally place it in the product backlog. The user story is then planned for development, i.e., it enters the sprint backlog. From there the Development Team fetches it into a sprint, develops it, and demonstrates it to the Stakeholder; upon feedback, it is either placed in a next sprint or delivered as an executable version for implementation (i.e., a product increment).
In addition to the presented concepts of the Scrum method, collaboration boards, i.e., visualisation tools (digital or a physical whiteboard), are commonly used to help teammates understand how the user stories are advancing in their development and thus facilitate communication and operative decision-making [13].

2.2. Data-driven requirements elicitation
Interviews, focus groups, and workshops are the main sources of conventional RE [6]. Recently, organizations have been collecting user feedback through digital sources such as social media, user forums, or even review systems [7]. A software product’s success likely depends on user feedback in the form of high ratings or positive comments. On the other hand, negative comments and low ratings may affect the sales numbers and the product’s reputation. Data-Driven Requirements Engineering (DDRE) takes advantage of a large amount of data retrieved either from user feedback in the form of natural language or from machine-generated sources [8, 9].
The authors of [9] considered emerging dynamic data sources as possible sources of requirements and categorized them into one or a combination of human-sourced data sources, process-mediated data sources, and machine-generated data sources. Human-sourced data sources refer to digitized records of human experiences. Some examples of human-sourced data sources include social media, blogs, and content from mobile phones. Process-mediated data sources are the records of business processes and business events that are monitored, such as electronic health records, commercial transactions, banking records, and credit card payments. Machine-generated data sources are the records of fixed and mobile sensors and machines that are used to measure events and situations in the physical world. They include, for example, readings from environmental and barometric pressure sensors, outputs of medical devices, satellite image data, and location data such as RFID chip readings and GPS outputs, or data from computer systems such as log files [14]. Even if the data was not initially intended for use in RE, it still provides essential information from which important change requests can be derived [15].
In [16], the authors emphasised the volume, velocity, and variety of Big Data as the main influential factors for scaling requirements management; they designed and developed a model-based semi-automated process to elicit candidate requirements from digital data sources. The authors of [17] stated that agile organisations that have relevant heterogeneous data sources could benefit from having semi-automatically elicited requirements added directly to their backlogs. Further, in [18], the authors argue that the volume, dynamics, and variety of digital data cause the elicitation of requirements to become even more iterative, tending towards continuous, but also complex and unstructured, which current agile methods are therefore unable to manage in a structured and efficient manner.
In our study, we have aimed to contribute to this challenge by proposing a synergy of stakeholder and data-driven requirements elicitation and development in the scope of the Scrum process.

3. Integration of data-driven requirements to Scrum method

3.1. Research approach
Methodologically, the study follows Design Science Research [10], an approach fostering incremental and iterative development of research artifacts in the IS domain, following the main steps that guide problem identification, design, development, demonstration, and evaluation. This study is a part of a larger DSR project which started by carrying out two case studies, one in a company involved in game development and the other in the online banking business, to identify the problems concerning methodological support for data-driven requirements elicitation [18]. The two companies have in common that their services rely highly on the preferences of customers, whose number is up to several million; they therefore showed a high interest in using digital data (forum blogs, social media, transaction and user logs, etc.) for the elicitation of requirements and in integrating these requirements into their agile development approach. In summary, during observations and interview sessions, both companies (i.e., Product Owners, Development Teams, CTO) provided important insights concerning the lack of a structured method supporting the company’s needs: optimising development resources when dealing with requirements originating from massive online sources, removing individual biases in requirements prioritization, and enabling more rapid system releases.
In this study we have continued the research by focusing on a structured proposal for integrating DDRE in the Scrum agile method. We have designed an artifact that integrates elicitation of stakeholder and data-driven requirements into a single Scrum development process (section 3.2).
Further, we did a semi-structured evaluation (section 4) and performed a demonstration with two illustrative cases (section 5).

3.2. An integrated process for data-driven and stakeholder requirements
The design of the process started by sketching the main tasks (i.e., actions, according to Scrum) representing the workflow for stakeholder-driven requirements elicitation in the Scrum environment. Then, the process was expanded to include and combine the actions relevant to the management of both data-driven and stakeholder-driven requirements, depicting also the Scrum roles interacting in each action and the related Scrum artifacts (Figure 1, below). This effort was further complemented by the design of a corresponding collaboration board for presenting the progression of the user stories in each of the actions of the integrated process (Figure 2, below).
In the traditional Scrum method, the main roles have the responsibilities described in section 2.1. However, when user stories are derived from digital data sources in addition to stakeholders, the responsibilities of the roles are changed and extended. The unintended data from digital sources are constantly updated. The Scrum team needs to know how to handle a potentially huge amount of new relevant information and how often this information should be taken into consideration as a candidate item for the product backlog. On the one hand, user stories from unintended data reduce stakeholders’ workload; on the other hand, they may cause noise or even deviation from the project goal. It is further important to keep the sprint scope adjustable in a way that will not affect the workload of the development team and the quality of the final software product. Another important challenge is the review and adjustment of the user stories derived from digital sources. Distinguishing, reviewing, combining, or excluding potential user stories is a complicated task to execute in such a short time.
The table below summarizes the restructured roles’ responsibilities:

Table 1
The responsibility of the main Scrum roles in the integrated requirements elicitation
Role: Responsibility
Stakeholder: Remains the author of his/her user stories. However, user stories derived from digital sources require analysis, assessment, and completion since most of them are ambiguous, vague, and incomplete. Because in most cases the author of a comment on an online forum is unreachable, the Stakeholder must complete the user story, help in assessing the risk of the respective user story, help in identifying possible relations to other user stories in the product backlog, and provide clarifications to the Development Team whenever needed.
Scrum Master: As the user stories are retrieved both from the digital sources and stakeholders, the complexity and magnitude of the Scrum practices and actions are increased. To ensure that all Scrum events are kept within the timebox, the Scrum Master needs to provide concise and clear product backlog items and facilitate stakeholders’ collaboration as requested. The user stories from forum posts or tweets will be continuously created; therefore, he/she should remove the barriers and keep the pace for the whole Scrum team without adding extra concerns or work to the other members.
Product Owner: The role is responsible for the product backlog and for ordering the items within it. With user stories from digital sources, the product backlog is continuously growing with more and more user story items. The new items must be communicated to the stakeholders, especially since they are not created or requested by them. The Product Owner is accountable for the management of an effective product backlog, and the communication with the stakeholders is essential to achieve the product goal.
Development Team: The team is enriched with a data analyst responsible for developing and maintaining the processing of raw digital data into candidate user stories. The team stays responsible for creating the sprint backlog, now also including data-driven user stories; hence developers review the provided input and assess the potential impact on the existing architecture without affecting the agreed plan, to achieve each sprint goal. When developers need clarifications about the data-driven user stories, stakeholders reply to them, as the author of unintended data cannot be tracked.

As a consequence of the changed responsibilities of the Scrum roles, the elicitation of user stories from digital sources impacts the Scrum activities. The process model in Figure 1 intends to provide information on the key tasks regarding the creation of user stories, their assessment, and their implementation during the sprints. The process is described from the perspective of the main Scrum aspects: actions, artifacts, and actors. The white-coloured symbols in Figure 1 represent the elements that are done the same as in the traditional Scrum approach (section 2.1); the purple colour represents the newly added elements for supporting data-driven requirements, while the yellow-coloured elements depict the traditional Scrum elements that changed due to extending the process to the elicitation of data-driven requirements.

Figure 1: Integrated Scrum process for the elicitation of user stories from digital sources, in addition to stakeholders’ user stories

The beginning of the process is determined by the source type of the incoming information. If the source is a stakeholder, the process starts with an initial user story created by him/her as the main author. When some information arrives from a digital source, a data processing action is executed first, following the approach presented in [16]; as a result, a candidate user story is obtained, containing a partial requirement description and possibly a priority (i.e., when applicable, from influence-related data).
For event-driven sources, e.g., microblogs, computer logs, and sensor data, the data may be collected from a location continuously, either in near real-time or in batches at defined time intervals; a timestamp and a status indicating that the data has been fetched are recorded [16]. For human-sourced data (section 2.2), which is mostly unstructured, Natural Language Processing (NLP) is required to extract relevant information. The analytical tasks for this source include Classification, Sentiment Analysis, and Named Entity Recognition (NER). The outputs of these tasks are associated with a Segment, which allows the body of the NL data to be divided into smaller units, such as sentences. Each processing step within the data processing action is associated with the algorithm or ML model that was used to achieve a particular data transformation. The action can often be fully automated; however, the Data Analyst role (from the Development Team) is responsible for developing or applying the needed algorithms and training the ML models, as well as for monitoring the execution of the processing (details are elaborated in [16]).
Once a user story candidate item is obtained, the Product Owner leads the reviews (section 2.1), but specifically for the data-driven user stories additional analysis is needed by the Scrum Master and Stakeholders because these items are often ambiguous and incomplete. If the analysis of the user story shows that it is not feasible to proceed with it (for quality, technical, or other reasons), it is discarded. Otherwise, the user story proceeds in the same way as the Stakeholder’s user stories: it is ranked, illustrated, and split into smaller user stories if needed. During the analysis it is also important to compare the user story with existing backlog items for possible similarity, because the data-driven items may arrive at a higher pace and in larger amounts than those of stakeholders.
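The segmentation, classification, and similarity comparison described above can be sketched in a few lines of Python. This is only a toy illustration using the standard library: the approach in [16] relies on NLP techniques and ML models, which are replaced here by a regular-expression sentence splitter, a keyword rule, and a Jaccard word-overlap score. All function names, the keyword list, and the 0.5 threshold are assumptions made for the example.

```python
import re
from typing import List, Set

# Hypothetical keyword rule standing in for a trained classifier.
REQUEST_KEYWORDS = {"cannot", "please", "fix", "want", "need"}


def segment(text: str) -> List[str]:
    """Split a natural-language post into sentence-like segments."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokens(text: str) -> Set[str]:
    """Lower-cased alphabetic words of a text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def is_request(segment_text: str) -> bool:
    """Toy classification: does the segment look like a change request?"""
    return bool(tokens(segment_text) & REQUEST_KEYWORDS)


def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity between two texts, in [0, 1]."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def similar_items(candidate: str, backlog: List[str],
                  threshold: float = 0.5) -> List[str]:
    """Backlog items whose similarity to the candidate reaches the threshold."""
    return [item for item in backlog if jaccard(candidate, item) >= threshold]


# A forum post is segmented and filtered for request-like segments.
post = "Great service. I cannot upload picture file!"
requests = [s for s in segment(post) if is_request(s)]
print(requests)

# The functionality of a candidate item is compared with backlog items.
backlog = ["upload image file", "delete my application"]
print(similar_items("upload picture file", backlog))
```

Only the functionality text is compared here, since the words shared by every card ("As a", "I want") would otherwise inflate the overlap score. Items flagged as similar would still be assessed by the Scrum roles, as the text describes.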
The figure below depicts the changed management of the user stories from the perspective of the Collaboration Board (section 2.1): the yellow-coloured elements depict stakeholder user stories, and the purple-coloured elements depict data-driven ones. After the Analysing activity, all user stories, whether derived from digital sources or from the stakeholders, are proposed to be treated the same (depicted in grey).

Figure 2: Collaboration board depicting the flow of stakeholder and data-driven user stories

Before the beginning of each sprint, as in the original Scrum process, the Product Owner, the Stakeholders, and the Scrum Master decide which product backlog items should be included in a sprint. During this phase, the Development Team identifies any changes necessary to complete the implementation of the sprint backlog items and refines the system architecture to support the new user stories (recall section 2.1). After the completion of this second phase, the iterative cycle of development work starts, i.e., sprints are executed. A number of meetings, some of which are daily stand-ups, occur during a sprint to review and discuss whether there are any new requirements to be added to the sprint backlog, which becomes critical now that the list of possible items also includes the requirements derived from digital sources. The Development Team gathers and consolidates the information retrieved from the meetings (i.e., adjust) either with the Scrum Master or with the Stakeholders (final clarifications may be required, especially for the user stories from unintended sources).
Apart from the extension of the roles’ responsibilities and main activities, the artefacts themselves are impacted by the inclusion of digital sources of requirements. The management of both the product backlog and the sprint backlog will be intensified, and their size will grow due to the additional requirements from digital sources.
Further, these user stories may not be feasible to implement within the period of one sprint: because they are not framed in the user story format from the beginning, they may depict broad expectations for improvements (i.e., epics) and therefore often need to be split. This adds even more items to the two backlogs and, in addition, requires an increased effort for the assessment of possible similarities and dependencies between the items.

4. Evaluation
We conducted three semi-structured interviews with a set of questions prepared in advance. The participants were Scrum Masters who had been working in software houses for more than 2-3 years. They were asked about their concerns and arguments regarding the proposed process, the roles’ responsibilities, and efficient backlog and user story management for facilitating a smooth inclusion of data-driven requirements elicitation in parallel with the stakeholder-driven one. The interviews lasted around three hours. The participants were fully familiar with the Scrum method and RE, while their knowledge about Data-Driven Requirements Engineering was somewhat limited. For that reason, they were provided with some DDRE material in advance. As the main artifact for evaluation, they obtained our initial proposal for the integrated process (i.e., the non-evaluated content of section 3.2). At the beginning, it was discussed with each of the participants that: i) the proposed process should not decrease the quality of the developed product; ii) no changes shall be made that would severely endanger the project goal in terms of cost and time; iii) the product backlog should continue to be refined based on the respective needs of the users; and iv) the scope of the product shall be refined and negotiated with the Product Owner and the stakeholders as the plan progresses.
A brief summary of the obtained feedback is presented in the list below:
• All three experts agreed that elicitation of the data-driven requirements should be integrated with the traditional (stakeholder-driven) process into a single process;
• The experts argued that the integration should be done in a way that preserves the (agile) practices of Scrum as much as possible;
• The experts suggested that the data-driven user stories should, regardless of their different source type, be assessed in the process similarly to stakeholder user stories wherever possible, to facilitate simplicity and transparency of the process;
• The experts helped substantially in detailing the Scrum actions, finalising their ordering, and defining the flow of the connections (Figure 1), as even the traditional Scrum process has rarely been published as a detailed workflow;
• The experts emphasised that the need for data-driven requirements may vary depending on the Scrum project, i.e., the product to be developed; that is, it depends for example on the amount and quality of the available digital data, the expertise needed to process digital sources and automate data derivation, and other factors;
• The experts contributed to the demonstration of the proposal (section 5) by commenting on the process for the two presented situations.
In conclusion, the experts acknowledged that DDRE has become an essential part of creating consumer-centric, user-friendly systems without defects. According to these experts, stakeholders always have a different perspective from the end-user; therefore, collecting requirements from end-users on a massive scale is very useful for the adoption of the product.
DDRE may play an important role in the Scrum environment by reducing time and costs owing to increased automation of requirements processing, but the efficiency of handling and grouping all the user stories derived from digital sources is a crucial aspect of the process in order to avoid overloading the Scrum team and deviating from the product goal and time plan. One of the experts mentioned that DDRE should not be used during the initial sprints of a Scrum project, where the system is not yet concretised, but all of them agreed that DDRE will affect the elicitation process more and more as the project moves forward, while the stakeholders’ workload will be reduced.

5. Demonstration
The Greek government has recently promoted a new type of financial support for citizens with specific social and financial status (low income, marital status, number of children, etc.). This governmental measure is called “pass”. It is a type of voucher and can be used in associated companies such as supermarkets, gas stations, electronic stores, etc. For this study, we focus on the “fuel pass” service (https://www.gov.gr/en/ipiresies/polites-kai-kathemerinoteta/metakineseis/fuelpass). A citizen applies for the “fuel pass” to receive two vouchers that can be used at specific gas stations. After the application is submitted, the system checks the type of the reported vehicle, the income of the applicant, and several other parameters. If the application is approved, the citizen receives two digital vouchers valid for a certain time period. Otherwise, the application is declined, and the citizen is informed accordingly. The system also provides an online forum on which the users are able to post their feedback, complaints, or suggestions for improvements of the service (https://www.gov.gr/en/contact).

5.1. Case A

Table 2
User’s post on the online forum
Body: I cannot upload picture file, please fix this before the application deadline!
Service: Forum
Sender: A registered user
Importance: High

Following the process from Figure 1, the obtained post is first processed using the DDRE approach developed in [16], which involves the use of NLP techniques and the supervision of the data analyst team member (recall section 3.2). The output is a user story candidate item containing a partial requirement description:

Table 3
User story candidate item #1 derived from the online Forum using [16]
Body:
  Role: Registered User
  Functionality: Upload image file
  Benefit: N/A
Conversation: before application deadline
Confirmation: N/A
Priority: High
Derived Card: As a registered user I want to upload image file

After the creation of this candidate item (item #1), the Product Owner, the Scrum Master, and the Stakeholders review the user story candidate. Since it is derived from digital sources and is incomplete, further analysis is required (Figure 3, left). This analysis is conducted with available human stakeholders since the author of the user story item is unknown and cannot provide further details. The user story item is completed by deciding on its benefit and by deriving the conversation part, defining exactly which types of attachments should be supported. Another aspect of the respective item could be the size or the number of attachments.

Figure 3: left) Review and analysis of user story #1 derived from digital sources; right) Ranking, illustrating, and splitting the user story item #1 to user story items #2 and #3

Apart from the analysis, ranking is required (Figure 3, right). Item #1 is marked with high priority; therefore, the stakeholders, the Scrum Master, and the Product Owner decide to which rank in the product backlog it will be moved. Illustration for this user story is not needed as the upload functionality is already implemented (just not for the image type of attachment).
However, the team decides to split the user story item into two user stories with more narrowly defined scopes: one focusing on supporting additional file formats, and the other on enabling the upload of more than one file. The new user story items #2 (Table 4) and #3 (Table 5) are presented below:

Table 4
User story item #2
Card: As a registered user, I want to upload image file as attachment to be able to upload different types of files other than pdf
Body
  Role: Registered User
  Functionality: Upload .jpeg, .png, or .doc/.docx file as attachment
  Benefit: To upload different types of files other than pdf
  Conversation: The development team assesses how the attachments will be stored in the database or the server files.
  Confirmation:
  - The system shall allow users to upload .jpeg, .png, .doc, .docx files.
  - The system shall not allow users to upload files with other extensions.
  - The system shall not allow users to upload multiple files.
  - The system shall allow users to not upload any extra attachments.
  Priority: High

Table 5
User story item #3
Card: As a registered user, I want to upload more than one file as attachment to submit the application
Body
  Role: Registered User
  Functionality: Upload more than one file as an attachment
  Benefit: To upload more files as extra attachments for the submission of the application.
  Conversation:
  - The development team shall assess the architecture of the system to accept more than one extra attachment.
  - The development team with the Scrum Master shall discuss the maximum number of extra attachments.
  Confirmation:
  - The system shall allow users to upload only one extra attachment.
  - The system shall allow users to upload two attachments.
  - The system shall not allow users to upload more than 10 extra attachments.
  Priority: Medium

The user story cards #2 and #3 are added to the product backlog as candidate items.
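The confirmation criteria of user story items #2 and #3 can be read as executable checks. The following minimal Python sketch is only an illustration and is not part of the “fuel pass” system: the function name validate_attachments, the max_files parameter, and the inclusion of .pdf in the allowed set (assumed because pdf upload is described as already supported) are assumptions made here.

```python
from pathlib import PurePath

# Extensions from the confirmation part of Table 4.
# Assumption: .pdf stays allowed because pdf upload already exists.
ALLOWED_EXTENSIONS = {".pdf", ".jpeg", ".png", ".doc", ".docx"}


def validate_attachments(filenames, max_files=1):
    """Return a list of problems; an empty list means the upload is accepted.

    max_files=1 mirrors item #2 (no multiple files); a larger value
    mirrors item #3, where the exact limit is still to be discussed.
    """
    problems = []
    if len(filenames) > max_files:
        problems.append(
            f"too many attachments: {len(filenames)} > {max_files}"
        )
    for name in filenames:
        # Compare the extension case-insensitively.
        extension = PurePath(name).suffix.lower()
        if extension not in ALLOWED_EXTENSIONS:
            problems.append(f"unsupported file type: {name}")
    return problems
```

Calling validate_attachments([]) returns an empty list, which corresponds to the criterion that users may submit without any extra attachment. Passing max_files=10 would correspond to the upper limit mentioned in Table 5.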
During the planning phase, the Product Owner and the stakeholders check the product backlog items and decide whether these items will be included in the next sprint. User story #2 (Table 4) is planned for the next sprint, while user story #3 (Table 5), with medium priority, will be planned for a later sprint. At this phase of the project, the Development Team also assesses whether the implementation of the planned product backlog item requires refinement of the system architecture. In this case, the developers assess how the system will allow the user to upload new file types and where these files will be stored on the server, but no further adjustments are required (Figure 1). The Scrum Master adds the planned product backlog item to the sprint backlog and the implementation begins. If more clarifications are needed (in this case, how the user will upload the file, or which message will be displayed for files with an incorrect extension), the Development Team discusses them with the Scrum Master. At the end of the sprint, a new executable version of the software is created, i.e., after the testing activities are completed and it is confirmed that the system satisfies the requirement. In this case, the developer checks that the user is allowed to upload an image file while submitting the application for the “fuel pass”.

5.2. Case B

In the current software version, there is already a product backlog item, #4 (Table 6), referring to the ability to delete an application if the due date has not passed:

Table 6
User story item #4 (in the product backlog)
Card: As a registered user I want to delete my application if the due date has not passed so that I can submit another application
Body
  Role: Registered User
  Functionality: Delete the user’s record in the applications table
  Benefit: Make it possible to submit a new application
  Conversation:
  - The development team assesses how the user deletes the latest submitted application when the due date has not passed.
  Confirmation:
  - The system shall allow users to delete the latest application if it has not been processed and the current date is equal to or less than the due date.
  - The system shall not allow users to delete the latest application if it has been processed.
  - The system shall not allow users to delete the latest application if the current date is greater than the due date.
  Priority: High

However, the processing of the Forum posts has shown that some users wrote (translated to English): “why can’t I remove my application?” or “it should be possible to delete the application after due date so that I use my support later”. These users have changed their mind and want to keep the right to the financial support for another occasion in the future. Table 7 presents the user story candidate item #5, aggregated from several similar forum posts using the approach in [16]:

Table 7
Candidate user story item #5 derived from the Forum using [16]
Body
  Role: Registered User
  Functionality: Delete application after due date
  Benefit: Use the support right
  Conversation: N/A
  Confirmation: N/A
  Priority: High
Derived Card: As a registered user I want to delete my application after due date

As described in Case A, after the creation of this candidate item, the Product Owner, the Scrum Master, and the available Stakeholders review the user story candidate item #5 (Table 7). Since it is derived from digital sources and is incomplete, analysis is required. However, the analysis of item #5 does not need to be carried out to the same extent, since a related user story item #4 (Table 6) is already in the product backlog. The Product Owner and the Scrum Master match the new user story #5 with the existing user story #4, which enables the user story card #5 to be updated and completed.
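The confirmation criteria of the existing user story item #4 (Table 6), against which candidate item #5 is matched, can be stated as a single rule. The sketch below is illustrative only; the function name can_delete_application and its parameters are hypothetical and are not taken from the actual system.

```python
from datetime import date


def can_delete_application(processed: bool, today: date, due_date: date) -> bool:
    """Deletion rule from the confirmation part of item #4 (Table 6).

    Deletion is allowed only if the application has not been processed
    and the current date is equal to or less than the due date.
    """
    return (not processed) and today <= due_date
```

Accepting candidate item #5 would change the due-date condition of this rule. Since the confirmation part of item #5 is still N/A in Table 7, the changed rule is not sketched here.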
Then, the Product Owner and the Scrum Master, in collaboration with the Stakeholders, rank user story #5 and decide whether it will be added to the product backlog and implemented in an upcoming sprint. They decide to include it in the product backlog; therefore, user story #4 will be removed, since it contradicts the accepted user story #5. The Product Owner, the Stakeholders, and the Scrum Master plan user story #5 for the next sprint.

6. Discussion and conclusions

The emerging presence of DDRE in software development, and the resulting need to integrate it into the agile methodology, specifically into the Scrum method, has been the main motivation for this study. We have proposed a process that combines stakeholder-driven and data-driven requirements in the Scrum environment in order to benefit from both requirements engineering approaches and to address integration issues in one of the most commonly used Agile methods. The challenge was not only to combine both requirements elicitation approaches but also to fit them into the Scrum methodology with the least possible impact on the established Scrum practices. The proposed integrated process is based on a workflow created by the authors to present how user stories are created, reviewed, planned, and implemented in Scrum when they are derived from stakeholders.

Using unintended data from digital sources in the requirements elicitation activity leads to increased customer satisfaction and more transparent decision-making, and time and cost management can be improved owing to increased automation of the elicitation. These are some of the main reasons for the Product Owner, together with the rest of the Scrum team, to treat the user stories derived from digital sources on an equal footing with those created by the stakeholders. In contrast with the user stories derived only from the stakeholders, candidate items from digital sources require further analysis.
Most of the time, they are ambiguous, vague, and incomplete. During the review of a candidate user story item, the stakeholders and the Scrum Master decide whether it will be added to the Product Backlog, declined, or split into smaller items. Sometimes a candidate backlog item is related to an existing one. The relation to existing implementations should be detected as early as possible in order to decide whether the item will be discarded or developed in an upcoming sprint. Furthermore, ranking is one of the important activities. Given the time plan and the product goal, the Product Owner and the Stakeholders prioritise a user story higher or lower relative to the rest of the product backlog items. Correctly understanding the importance of data-driven user stories, based on the impact-related data from the online sources and the volume of similar information, is a new responsibility in the integrated process. Digital sources constantly provide data and user stories to be added; therefore, planning and reviewing them may overload the Scrum team. A solution to this challenge is for the Product Owner and the Scrum Master to invite the stakeholders to provide advice, spot and relate the most important items, and discard the items that are less relevant to the product goal.

In addition to the practical experience of the first author in the domain of RE, agile methodology and its methods, and the corresponding scientific knowledge of both authors, the reliability of this study has been increased by the evaluation process, because all of the interviewed Scrum experts recognised the importance of integrating DDRE into the Scrum method and provided numerous useful comments and suggestions for improvement. The study distinguishes itself from related publications in that it contributes the perspective of an entire development method, in particular Scrum.
The proposal provides some relevant insights for scientific research and essential concepts for practical application. At this point, one goal is to apply the proposal in agile projects, for validation and further learning. Another, more scientific goal is to increase the efficiency of the proposed process through increased automation of the integrated Scrum activities, such as the different analysis, ranking, and structuring actions, in order to reduce and streamline the responsibilities of the Scrum team.

References

[1] M. Cohn. User stories applied: For agile software development. Addison-Wesley signature series. Addison-Wesley (2004)
[2] K. Schwaber. Scrum development process. OOPSLA’95 workshop on business object design and implementation. Austin, USA (1995)
[3] K. Schwaber & J. Sutherland. Scrum Guide | Scrum Guides. [online] Scrumguides.org. (2020) Available at: https://scrumguides.org/scrum-guide.html
[4] V.N. Vithana. Scrum Requirements Engineering Practices and Challenges in Offshore Software Development. International Journal of Computer Applications (0975 – 8887), Volume 116, No. 22 (2015)
[5] J. Zdravkovic, J. Stirna, J. C. Kuhr, & H. Koç. Requirements Engineering for Capability Driven Development. In: The Practice of Enterprise Modeling. PoEM. Lecture Notes in Business Information Processing, vol 197. Springer, Berlin, Heidelberg (2014)
[6] K. Pohl. Requirements Engineering: Fundamentals, Principles, and Techniques. Springer, Heidelberg, New York (2010)
[7] W. Maalej, M. Nayebi, T. Johann & G. Ruhe. Toward Data-Driven Requirements Engineering. IEEE Software, 33(1), pp. 48–54 (2016)
[8] C. Quer, X. Franch, C. Palomares, A. Falkner, A. Felfernig, D. Fucci, W. Maalej, J. Nerlich, M. Raatikainen, G. Schenner, M. Stettinger, & J. Tiihonen. Reconciling Practice and Rigor in Ontology-based Heterogeneous Information Systems Construction. In: Proc. of the Practice of Enterprise Modeling (PoEM), LNBIP vol. 335, Springer, pp. 205–220 (2018)
[9] S. Lim, A. Henriksson, & J. Zdravkovic. Data-Driven Requirements Elicitation: A Systematic Literature Review. Springer Nature Computer Science vol. 2/16 (2021)
[10] K. Peffers, T. Tuunanen, M. Rothenberger, & S. Chatterjee. A Design Science Research Methodology for Information Systems Research. Journal of Management Information Systems 24(3), pp. 45–77 (2007)
[11] D. Maximini. Scrum Culture: Introducing Agile Methods in Organizations. Springer (2019)
[12] X. Wang, L. Zhao, Y. Wang, & J. Sun. The Role of Requirements Engineering Practices in Agile Development: An Empirical Study. Requirements Engineering, pp. 195–209 (2014) doi: https://doi.org/10.1007/978-3-662-43610-3_15
[13] M. Cardinal. Executable Specifications with Scrum. Addison-Wesley (2013)
[14] D. Firmani, M. Mecella, M. Scannapieco, & C. Batini. On the Meaningfulness of Big Data Quality. Data Science and Engineering, vol. 1(1), pp. 6–20 (2015)
[15] M. van Vliet, E.C. Groen, F. Dalpiaz & S. Brinkkemper. Identifying and Classifying User Requirements in Online Feedback via Crowdsourcing. Requirements Engineering: Foundation for Software Quality, REFSQ. Lecture Notes in Computer Science, vol 12045. Springer, pp. 143–159 (2020)
[16] A. Henriksson & J. Zdravkovic. Holistic Data-Driven Requirements Elicitation in the Big Data Era. Software and Systems Modeling, Springer, vol. 21, pp. 1389–1410 (2021)
[17] M. Oriol, S. Martínez-Fernández, W. Behutiye, C. Farré, R. Kozik, P. Seppänen, A. M. Vollmer, P. Rodríguez, X. Franch, S. Aaramaa, A. Abhervé, M. Choraś, & J. Partanen. Data-driven and tool-supported elicitation of quality requirements in agile companies. Software Quality Journal (2020) doi: https://doi.org/10.1007/s11219-020-09509-y
[18] X. Franch, A. Henriksson, J. Ralyté, & J. Zdravkovic. Data-Driven Agile Requirements Elicitation through the Lenses of Situational Method Engineering. IEEE International Requirements Engineering Conference, IEEE Computer Society, pp. 402–407 (2021) https://ieeexplore.ieee.org/document/9604733