Risk Identification of Data Science Projects: A Literature Review

Maike Holtkemper, Maria Potanin, Alexander Oberst and Christian Beecks

University of Hagen, Department of Mathematics and Computer Science, Chair of Data Science, Germany

Abstract

While cost, time, and resources are considered to have a high impact on data science projects, risks are key to the successful implementation of a project. The correct handling of risks has been proven to increase a project’s chances of success. Therefore, awareness of existing and emerging risks as well as their assessment play a major role in data science project management. According to a 2021 estimate, 87 percent of data science projects fail and are thus not implemented successfully. The path to successful implementation of data science projects in companies is more complex and uncertain than for conventional projects, in particular with regard to the identification and availability of the necessary skills in a company before and during the project. For software engineering projects, the analysis and evaluation of potential risks is well established, but for data science projects potential risks have not yet been analyzed at a larger scale. To identify the potential reasons for failure of data science projects, an in-depth understanding of potential risks and of measures for mitigating these risks is essential. In this paper, we conduct a systematic literature review on risks of data science projects and present the top fifteen risks as first findings. Furthermore, we compare the identified risks to the major risks of software engineering projects to highlight similarities and differences between these two disciplines. The findings of our literature review can be used to guide the development of future risk assessment systems for complex data science projects.

Keywords

Data Science, Projects, Risk Identification

1. Introduction

Data science projects have become an integral part of today’s companies.
Leveraging data has proven essential for staying competitive and gaining market share [38]. Instead of the service or product itself, the collected data is often considered a valuable asset, which can be used for developing innovative business models such as “Smart Services” [30]. A representative study conducted by MIT [35] among 2602 managers, executives, and data professionals demonstrated that companies were able to gain a competitive advantage, optimize existing processes and increasingly develop innovative business models through the use of data and analytics. In addition, the study reveals that these companies were able to supplement human skills through the use of smart machines, thus reducing time-consuming tasks [35]. However, the path to successful implementation of data science projects in companies is more complex and uncertain than for conventional projects, “a multidimensional challenge requiring specialised tools, processes and methodologies” [26], in particular with regard to the identification and availability of the necessary skills in a company before and during the project [11].

LWDA: Learning, Knowledge, Data, Analysis 2023, October 09–11, 2023, Marburg, Germany. © 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073.

In 2013, “55 percent of big data projects don’t get completed, and many more fail their objectives” [19]. In 2015, Gartner [14] stated that “through 2017, 60 percent of big data projects will fail to go beyond piloting and experimentation, and will be abandoned”. As of 2021, 87 percent of data science projects fail and do not get deployed [53].
To address this challenge, data science process models [18] have been developed to assist during the realization of a data science project. While process models in software projects are commonly used and have evolved over time, they are still gaining importance in data science [26]. A comparison of seven data science process models shows that four of the seven highlight aspects of project management [26], but pre- and post-project tasks such as risk management are poorly represented. Risk management can not only help to create awareness of existing risks at the beginning of a data science project, but also sensitize the project team to emerging risks during the project. In this way, challenges can be recognized and addressed at an early stage. Therefore, the first step in risk management is the identification of risks to assure project success [47].

For software engineering projects, several risk factors have already been identified. Keil et al. [23] published a framework for identifying software project risks, and in the last twenty years, critical factors and top ten risk lists for software project failure have been published [1, 4, 13, 15, 24, 44]. Compared to software development, data science is still quite young and has only gained importance in recent years [36]. Consequently, considerably less literature is available to date on risk management and project risks that focuses on data science-specific aspects such as the cross-disciplinary competences of team members.

This paper aims to investigate the risks of data science projects. First, a literature review on risks of data science projects is performed, and second, the identified risks are categorized. The literature review thus aims to answer the following research question (RQ1): What are the major risks of data science projects?
The top fifteen identified risks are then described and compared to the most common risks of software engineering projects, which leads to the second research question (RQ2): What are the similarities and differences between the risks of a software engineering project and those of a data science project?

The paper is structured as follows: First, the background is described in Section 2. Second, the methodology of the data collection is described in Section 3. Third, the findings are presented and discussed in Section 4. Finally, future work and limitations are given in Section 5.

2. Background

According to Muhlbauer, risk can be defined as something that can result in failure or loss [32]. In the context of a project, risk can be defined as an uncertain event that, if it occurs, has a negative or positive impact on the objectives [33]. Such definitions led to the standard definition of risk: according to ISO 31000:2018, risk is the “effect of uncertainty on objectives” [21]. An effect is described as a deviation from the expected; it can be positive, negative, or both, and can address, create, or result in opportunities and threats [21]. Objectives can have different aspects and categories and can be applied at various levels [21].

Important terms related to risk are sources, events, consequences and likelihood [20]. Risk sources identify where risks can originate, risk events denote the concrete realization of a risk, risk consequences describe the potential outcome of a risk event, and risk likelihood is a qualitative assessment of how likely a risk is to occur. The related term risk management describes the coordination and control of risk activities within an organization, which includes risk assessment, risk treatment, risk acceptance and risk communication [20, 21, 22]. Risk assessment is the overall process of risk identification, risk analysis, and risk evaluation.
The purpose of risk identification is to find, recognize, and describe risks that might help or prevent an organization from achieving its objectives. When identifying risks, the fundamental questions are what, where, when, why, and how a potential risk could occur; the risk elements are then categorized [21]. According to the same standard, risk analysis is a systematic process for understanding risk and its characteristics. It involves a detailed view of uncertainties, sources of risk, consequences, probability, events, scenarios, controls and their effectiveness. An event can have multiple causes and consequences, which in turn affect multiple objectives. Analysis techniques can be qualitative, quantitative, or a combination of both. Risk analysis provides input to risk evaluation and to decisions about whether and how a risk should be treated. Risk evaluation involves comparing the results of the risk analysis with the established risk criteria to determine where additional measures are required [20, 21]. According to Grassi et al. [17], risk evaluation is the most important task in risk assessment.

Although risk assessment is an important part of software project management, it is still in its infancy in data science projects and is thus highly underestimated and often not performed at all. As Kutzias et al. [26] demonstrated, recent data science process models such as CRISP-DM or DASC-PM emphasize project management, but neglect pre- and post-project tasks, which also include risk assessment. In software projects, many risks occur during the process of creating the software, where the risk may lie, for example, in understanding the requirements, integrating modules or feasibility [37]. Risk identification in the design phase is crucial because “if senior manager fail to detect such risks, it is possible that such projects may collapse completely” [31].
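The risk terms introduced in this section (source, event, consequence, likelihood) and the evaluation step can be sketched as a small data structure. This is a minimal illustration under stated assumptions, not a method from the paper or from ISO 31000: the three-step `Level` scale, the `impact` attribute, the likelihood-times-impact score and the threshold of 4 are all hypothetical choices, and the sources and consequences in the example entries are invented for illustration.

```python
from dataclasses import dataclass
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical three-step qualitative scale (not prescribed by ISO 31000)."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class Risk:
    source: str        # where the risk can originate
    event: str         # concrete realization of the risk
    consequence: str   # potential outcome of the risk event
    likelihood: Level  # qualitative assessment of how likely the event is
    impact: Level      # assumed attribute: severity of the consequence

    def score(self) -> int:
        # Likelihood x impact is one common convention among many.
        return int(self.likelihood) * int(self.impact)


def needs_treatment(risk: Risk, criterion: int = 4) -> bool:
    """Risk evaluation: compare the analysis result with a risk criterion."""
    return risk.score() >= criterion


# Illustrative register; the event names are risks named in the reviewed literature.
register = [
    Risk("data provider", "limited data access",
         "model cannot be trained", Level.MEDIUM, Level.HIGH),
    Risk("project team", "poor time management",
         "schedule delay", Level.LOW, Level.MEDIUM),
]

to_treat = [risk.event for risk in register if needs_treatment(risk)]
print(to_treat)
```

Separating the analysis result (`score`) from the criterion (`needs_treatment`) mirrors the distinction between risk analysis and risk evaluation described above; any other scoring scheme could be substituted without changing the structure.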
To create awareness of risk assessment within data science projects, this paper focuses on identifying potential risks by means of a literature review. The methodology is described in the following section.

3. Research Method

Based on the recommendations of Webster and Watson [52], a literature review is conducted, including keyword and backward search. The process is documented according to the procedure of vom Brocke et al. [49] and summarized in Figure 1. The keyword search focuses on risks of data science projects. In the first step, the scientific literature databases Scopus, IEEE Xplore, EBSCOhost and ACM Digital Library were searched using defined search queries (cf. Table 1), which yielded 354 papers. In the second step, duplicates were removed, so that 314 papers remained. In the third step, the abstracts of the remaining papers were evaluated with regard to their content, leading to 57 papers that were considered relevant. In the fourth step, the full text of these papers was evaluated by extracting the named risks. As a result of the keyword search, 24 papers (cf. Table 2) and the 143 risks mentioned therein were considered relevant for a further backward search. In the fifth step, the references of the 24 relevant papers from the keyword search were searched for further papers on risks of data science projects. As a result of the backward search, 16 papers (cf. Table 3) with an additional 105 identified risks were added to the set of relevant papers. As a final result, a total of 248 risks of data science projects were identified through the aforementioned literature review process.

Figure 1: Literature Review Process

Table 1: Search Terms
Search string | Scopus | EBSCOhost | IEEE Xplore | ACM Digital Library
("Data Science Projects" OR "Big Data Projects" OR "Machine Learning Projects") AND ("Risk*" OR "Challenge*") | 219 | 91 | 39 | 5
Total: 354
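The counts reported for the review process can be cross-checked with a few lines of arithmetic. The sketch below only restates the numbers given in this section and in Table 1; the variable names and step labels are illustrative.

```python
# Hits per database for the search string in Table 1.
hits_per_database = {
    "Scopus": 219,
    "EBSCOhost": 91,
    "IEEE Xplore": 39,
    "ACM Digital Library": 5,
}
total_hits = sum(hits_per_database.values())

# Papers remaining after each screening step of the keyword search.
screening_steps = [
    ("keyword search hits", total_hits),
    ("after duplicate removal", 314),
    ("relevant after abstract screening", 57),
    ("relevant after full-text evaluation", 24),
]

# Number of papers dropped between consecutive steps.
dropped = [
    (later_name, earlier_count - later_count)
    for (_, earlier_count), (later_name, later_count)
    in zip(screening_steps, screening_steps[1:])
]

# Papers and named risks contributed by keyword and backward search.
keyword_papers, keyword_risks = 24, 143
backward_papers, backward_risks = 16, 105
total_papers = keyword_papers + backward_papers
total_risks = keyword_risks + backward_risks

for name, count in screening_steps:
    print(f"{name}: {count}")
print(f"papers: {total_papers}, risks: {total_risks}")
```

The per-database hits add up to the 354 papers of the first step, and the keyword and backward search together account for the 40 papers and 248 risks reported as the final result.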
Table 2: Findings of Keyword Search
Title | Reference | Year | Search Type
Demystifying Data Science Projects: A Look on the People and Process of Data Science Today | Aho et al. | 2022 | keyword
On the Application of SCRUM in Data Science Projects | Kraut and Transchel | 2022 | keyword
Toward a Conceptualization of Big Data Value Chain: From Business Problems to Value Creation | Louati and Mekadmi | 2022 | keyword
On the Experiences of Adopting Automated Data Validation in an Industrial Machine Learning Project | Lwakatare et al. | 2021 | keyword
The Risk Management Process for Data Science: Gaps in Current Practices | Lahiri and Saltz | 2022 | keyword
Risks of Data Science Projects - A Delphi Study | Varela | 2022 | keyword
Analyzing a Data Science Online Practitioner Community: Trends and Implications for Data Science Project Management | Tacheva et al. | 2022 | keyword
Evaluating Data Science Project Agility by Exploring Process Frameworks Used by Data Science Teams | Lahiri and Saltz | 2023 | keyword
Managing and Composing Teams in Data Science: An Empirical Study | Aho et al. | 2021 | keyword
A survey study of success factors in data science projects | Martinez et al. | 2021 | keyword
Nine Questions to Evaluate a Data Science Team’s Process: Exploring a Big Data Science Team Process Evaluation Framework Via a Delphi Study | Saltz | 2022 | keyword
Don’t Be Afraid of Failure—Insights from a Survey on the Failure of Data Science Projects | Aßmann | 2023 | keyword
An iterative and incremental data preprocessing procedure for improving the risk of big data project | Lai and Leu | 2017 | keyword
Bad big data science | Haug | 2016 | keyword
Identifying critical issues in smart city big data project implementation | Barham and Daim | 2018 | keyword
Big data and business analytics: Trends, platforms, success factors and applications | Ajah and Nweke | 2019 | keyword
Evolutionary Computing Environments: Implementing Security Risks Management and Benchmarking | Malik and Singh | 2020 | keyword
Privacy, security and legal challenges in big data | Abdullah | 2018 | keyword
Significance and challenges in big data: A survey | Jothi et al. | 2016 | keyword
The need for an enterprise risk management framework for big data science projects | Saltz and Lahiri | 2020 | keyword
Significance and Challenges of Big Data Research | Jin et al. | 2015 | keyword
Big data project success - A meta analysis | Koronios et al. | 2014 | keyword
Five Reasons Why Your Data Science Project is Likely to Fail. | Preimesberger | 2019 | keyword
Inadequate infrastructure halting big data projects. | Connolly | 2015 | keyword
Total Findings of Keyword Search: 24

Table 3: Findings of Backward Search
Title | Reference | Year | Search Type
Towards an Improved ASUM-DM Process Methodology for Cross-Disciplinary Multi-organization Big Data and Analytics Projects | Angée et al. | 2018 | backward
The Age of Data: What You Need to Know About Fundamentals, Algorithms, and Applications | Aust | 2021 | backward
Achieving Agile Big Data Science: The Evolution of a team’s Agile Process Methodology. | Saltz and Shamshurin | 2019 | backward
Comparing Data Science Project Management Methodologies via a Controlled Experiment | Saltz et al. | 2017 | backward
SKI: An Agile Framework for Data Science. | Saltz and Sutherland | 2019 | backward
Exploring Project Management Methodologies Used Within Data Science Teams. | Saltz et al. | 2018 | backward
Progressive Data Science: Potential and Challenges. | Turkay et al. | 2018 | backward
Addressing barriers to big data | Alharthi et al. | 2017 | backward
Data-intensive applications, challenges, techniques and technologies: A survey on Big Data | Chen and Zhang | 2014 | backward
Beyond the hype: Big data concepts, methods, and analytics | Gandomi and Haider | 2015 | backward
Data science: challenges and directions. | Cao | 2017 | backward
Critical success factors for managing data science projects within agile methodology | Limesha | 2021 | backward
Big-data/analytics projects failure: a literature review | Reggio | 2020 | backward
Data Management Risks: A Bane of Construction Project Performance | Tanga et al. | 2022 | backward
A Critical Quality Measurement Model for Managing and Controlling Big Data Project Risks | Lai et al. | 2018 | backward
Top Ten Lists of Software Project Risks: Evidence from the Literature Survey | Arnuphaptrairong | 2011 | backward
Total Findings of Backward Search: 16
Total Findings of Keyword and Backward Search: 40

4. Results

As described in the previous section, a total of 248 risks emerging in data science projects have been identified through our literature review process. Since these risks can be worded differently depending on the literature source, the next step is to classify the risks into categories. As a result, we obtain 29 risk categories of varying frequency. These categories are summarized in Table 4. As can be seen in Table 4, “Insufficient project management” is the risk category comprising the most frequently mentioned risks. This category includes risks such as “poor task communication”, “poor time management”, “lack on focus on process and team coordination” or “plan cost overrun and schedule delay”.
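The frequencies reported in Table 4 can be tallied and ranked with a short script. The sketch below only restates the table's numbers; the variable and function names are illustrative, and the category shares are derived quantities that the paper itself does not report.

```python
# Category frequencies as listed in Table 4 (category -> number of named risks).
risk_categories = {
    "Insufficient project management": 87,
    "Data security and privacy": 26,
    "Poor data availability, quality, and timeliness": 20,
    "High complexity": 18,
    "Lack of data science competence/skills": 15,
    "Poor technical development/deployment practices": 14,
    "Insufficient data and information management": 11,
    "Model accuracy": 9,
    "Poor communication with customer/stakeholders": 7,
    "Organizational culture": 6,
    "Poor user management": 5,
    "Poor customer expectation management": 4,
    "Poor requirement management": 4,
    "Poor team management": 3,
    "Data inconsistency and incompleteness": 2,
    "New technology": 2,
    "Uncertainty about project outcome": 2,
    "Data ownership unclear": 2,
    "Poor domain knowledge": 1,
    "Insufficient documentation": 1,
    "Poor maintenance planning": 1,
    "Insufficient infrastructure": 1,
    "Poor data verification": 1,
    "Publication bias": 1,
    "Operational risks": 1,
    "Market risks": 1,
    "Political risks": 1,
    "Data dependency risks": 1,
    "No interaction with analytics-based program": 1,
}

total_risks = sum(risk_categories.values())


def share(category: str) -> float:
    """Fraction of all named risks that fall into the given category."""
    return risk_categories[category] / total_risks


# sorted() is stable, so tied categories keep their Table 4 order.
ranked = sorted(risk_categories.items(), key=lambda item: item[1], reverse=True)
top_fifteen = ranked[:15]
top_fifteen_total = sum(frequency for _, frequency in top_fifteen)

for name, frequency in top_fifteen:
    print(f"{frequency:3d}  {share(name):6.1%}  {name}")
print(f"top fifteen cover {top_fifteen_total} of {total_risks} named risks")
```

Categories 15 to 18 each contain two risks, so where a top fifteen list is cut depends on how ties are ordered; the sketch simply keeps the order of Table 4.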
In the literature, these project management risks dominate over technical issues. Saltz et al. [40] point out, for example, that “data science projects need to focus on people, process and technology”, and immature processes are, among other factors, responsible for project failure in most cases [42]. The risk category with the second most frequently mentioned risks is “Data security and privacy”, which includes concerns such as “cyber attacks” or “data privacy”. The third risk category, named “Poor data availability, quality, and timeliness”, includes risks such as “bad data quality”, “broken data”, “limited data access” or “timeliness of data”. In the literature, data quality issues are mentioned in various contexts: data cleansing, the impact on model accuracy, or highly complex but faulty data sets [46]. Risks categorized as “Lack of data science competence/skills” were also named quite often and reflect the challenge for companies to find skilled data scientists, data analysts or machine learning engineers on the labor market to handle data science projects successfully [12, 48]. Our findings are summarized in Figure 2, answering the first research question (RQ1).

Table 4: Risk categories
No. | Risk category | Frequency
1 | Insufficient project management | 87
2 | Data security and privacy | 26
3 | Poor data availability, quality, and timeliness | 20
4 | High complexity | 18
5 | Lack of data science competence/skills | 15
6 | Poor technical development/deployment practices | 14
7 | Insufficient data and information management | 11
8 | Model accuracy | 9
9 | Poor communication with customer/stakeholders | 7
10 | Organizational culture | 6
11 | Poor user management | 5
12 | Poor customer expectation management | 4
13 | Poor requirement management | 4
14 | Poor team management | 3
15 | Data inconsistency and incompleteness | 2
16 | New technology | 2
17 | Uncertainty about project outcome | 2
18 | Data ownership unclear | 2
19 | Poor domain knowledge | 1
20 | Insufficient documentation | 1
21 | Poor maintenance planning | 1
22 | Insufficient infrastructure | 1
23 | Poor data verification | 1
24 | Publication bias | 1
25 | Operational risks | 1
26 | Market risks | 1
27 | Political risks | 1
28 | Data dependency risks | 1
29 | No interaction with analytics-based program | 1
Total: 248

With regard to the second research question (RQ2), software engineering is concerned with the development of software; a software engineering project thus includes the design, implementation and testing of the software. In addition to the design process, it covers the planning of the software system, the requirement analysis and the maintenance [8, 51].
To structure a software engineering project, process models such as the spiral model are used. The spiral model is a risk-driven process model for software development that repeatedly runs through its partial steps: description of the basic conditions with definition of the objectives, evaluation of the identified alternative solutions to mitigate or avoid risks, development and review of an intermediate product, and planning of the next iteration [51]. Despite this close-meshed approach, not all risks can be reduced.

Figure 2: Literature Review Process

The top ten risks in software engineering projects include insufficient requirement management, lack of management commitment and lack of project management methodology [1, 4, 13, 15, 16, 24, 44]. Ghazali et al. [16] conducted a literature review that categorized the identified risks into “management risks”, “people risks” and “technology risks” and concluded that management risks rank highest compared to people and technology risks. These management risks include, for example, “project milestones undefined”, “requirements change”, “lack of agile progress tracking mechanism” as well as “lack of resources”, “failure to manage end-user expectation” or “lack of management commitment” [16]. In data science projects, insufficient project management, which is comparable to Ghazali’s management category, is also the most frequently named risk. We can conclude that the risk of insufficient project management is high in software engineering projects as well as in data science projects and should not be underestimated.

Regarding Ghazali’s category “people risks”, risks such as “lack of necessary skill-set”, “experience and training problem”, “lack of team work” or “unmotivated team member” address the risk of insufficient team management, which is also a common problem of data science projects [16].
In addition, the goal of a software engineering project is to develop a product that is useful to the end-user; if the end-user has difficulties using the final product, this is a considerable risk. Therefore, frequent testing is vital for software engineering projects. Similarly, successful customer expectation management is crucial for the outcome of data science projects. The last category of Ghazali’s literature review, “technical risks”, includes risks such as “lack of key technology”, “inappropriateness of technology and tools” or “processor management insufficient”, which appear less frequently in data science projects [16]. In contrast to Ghazali’s “technical risks”, the risk of “Poor data availability, quality, and timeliness” is one of the most frequently named risks in data science projects and can contribute significantly to the failure of the project.

5. Conclusion

This paper reports the results of a literature review on risks of data science projects. Through this literature analysis, relevant sources were collected and potential risks identified and categorized (RQ1). In total, 354 sources were found, of which 40 papers documented 248 relevant risks. After cleaning the results, the 248 risks were categorized by main term and the most frequently mentioned risk categories were presented. The results show the need for a more detailed risk assessment to assist the project manager throughout the project. Risks summarized under the term “insufficient project management” can be addressed through frequent risk sensitivity analysis, which highlights, for example, upcoming scheduling challenges early enough to avoid project failure. Risks that affect the project outcome, such as “data security and privacy” risks or “poor data availability, quality, and timeliness”, should be recognized as early as possible in the project.
Therefore, a risk assessment that evaluates the identified risks of a data science project at the beginning and during the project period is vital.

Furthermore, the similarities and differences between the risks of a software engineering project and those of a data science project (RQ2) have been described. The comparison shows that there are similarities between both disciplines, especially regarding the risks of insufficient project and team management. Regarding the technical risks, data science projects have a particularly high risk of failure if, for example, the data is not available or of poor quality, while software engineering projects fail less frequently due to technical risks.

The limitations of this literature analysis lie on the one hand in the definition of the search queries and on the other hand in the naming and assignment of risks to the individual categories, since these are always shaped by subjective criteria such as personal level of knowledge and experience.

Regarding the risk assessment, the first step of risk identification was successfully performed in this paper. As an outlook to future work, these categorized risks form the basis for the development of a method for automated risk assessment of data science projects. Among others, discrete multi-attribute decision-making (MADM) methods and continuous multi-objective decision-making (MODM) methods [10] are considered for this purpose.

6. Acknowledgement

Parts of this paper were conducted as part of the DS3W project at the Research Center Work – Education – Digitalization. This project is funded by the Ministry of Culture and Science of North Rhine-Westphalia, Germany.

References

[1] Addison, T., Vallabh, S. (2002, September). Controlling software project risks: an empirical study of methods used by experienced project managers.
In Proceedings of the 2002 annual research conference of the South African institute of computer scientists and information technologists on Enablement through technology (pp. 128-140).
[2] Alharthi, A., Krotov, V., Bowman, M. (2017). Addressing barriers to big data. Business Horizons, Volume 60, Issue 3, pp. 285-292.
[3] Angée, S., Lozano, S., Montoya-Munera, E., Ospina Arango, J., Tabares, M. (2018). Towards an Improved ASUM-DM Process Methodology for Cross-Disciplinary Multi-organization Big Data and Analytics Projects. In: 13th International Conference, KMO 2018, Žilina, Slovakia, August 6–10, 2018, Proceedings.
[4] Arnuphaptrairong, T. (2011). Top Ten Lists of Software Project Risks: Evidence from the Literature Survey. In: Proceedings of the International MultiConference of Engineers and Computer Scientists 2011 Vol I, IMECS 2011, March 16 - 18, 2011, Hong Kong.
[5] Asay, M. (2017). 3 ways to massively fail with machine learning (and one key to success). https://www.techrepublic.com/article/3-ways-to-massively-fail-with-machine-learning-and-one-key-to-success/.
[6] Aßmann, J., Sauer, J., Schulz, M. (2023). Don’t Be Afraid of Failure—Insights from a Survey on the Failure of Data Science Projects. In Apply Data Science: Introduction, Applications and Projects (pp. 65-76). Wiesbaden: Springer Fachmedien Wiesbaden.
[7] Aust, H. (2021). The Age of Data: What You Need to Know About Fundamentals, Algorithms, and Applications / Das Zeitalter der Daten: Was Sie über Grundlagen, Algorithmen und Anwendungen wissen sollten. Springer, Berlin.
[8] Balzert, H. (2000). Lehrbuch der Software-Technik: Software-Entwicklung, 2. Aufl., Heidelberg, Spektrum Akademischer Verlag.
[9] Cao, L. (2017). Data science: challenges and directions. Communications of the ACM, Volume 60, Issue 8, August 2017, pp. 59–68. https://doi.org/10.1145/3015456
[10] Djenadic, S., Tanasijevic, M., Jovancic, P., Ignjatovic, D., Petrovic, D., Bugaric, U. (2022). Risk Evaluation: Brief Review and Innovation Model Based on Fuzzy Logic and MCDM. Mathematics 2022, 10, 811. https://doi.org/10.3390/math10050811
[11] Dukino, C., Kutzias, D., Link, M. (2022). Roles and competences of data science projects. The Human Side of Service Engineering, Vol. 62, AHFE International, pp. 250-255.
[12] Eberhard, B., Podio, M., Pérez Alonso, A., Radovica, E., Avotina, L., Peiseniece, L., Sendon, M.C., Gonzales Lonzano, A., Solé-Pla, J. (2017). Smart work: The transformation of the labour market due to the fourth industrial revolution. International Journal of Business and Economic Sciences Applied Research, Vol. 10, Issue 3.
[13] Elzamly, A., Hussin, B. (2015). Modelling and evaluating software project risks with quantitative analysis techniques in planning software development. Journal of computing and information technology, 23(2), 123-139.
[14] Gartner (2015). Gartner says business intelligence and analytics leaders must focus on mindsets and culture to kick start advanced analytics. Gartner web site, https://www.gartner.com/en/newsroom/press-releases/2015-09-15-gartner-says-business-intelligence-and-analytics-leaders-must-focus-on-mindsets-and-culture-to-kick-start-advanced-analytics. Last visited on 19/07/2023.
[15] Georgiev, V., Stefanova, K. (2014). Software development methodologies for reducing project risks. Economic Alternatives, 2, 104-113.
[16] Ghazali, N., Fauzi, S., Gining, R., Sobri, W., Suali, A. (2020). Visualizing Software Risks in Software Engineering Projects using Risk Sensitivity Analysis Approach. Journal of Physics: Conf. Ser. 1529 022074.
[17] Grassi, A., Gamberini, R., Mora, C., Rimini, B. (2009). A fuzzy multi-attribute model for risk evaluation in workplaces. Safety Science, Vol. 47, Issue 5, 707–716.
[18] Haertel, C., Pohl, M., Nahhas, A., Staegemann, D., Turowski, K. (2022). Toward a Lifecycle for Data Science: Literature Review of Data Science Process Models. PACIS 2022 Proceedings.
[19] Survey: What IT Teams Want Their CIOs to Know About Enterprise Big Data. https://www.prnewswire.com/news-releases/survey-what-it-teams-want-their-cios-to-know-about-enterprise-big-data-188190311.html. Last visited on 19/07/2023.
[20] ISO Guide 73:2009 (2009). Risk Management—Vocabulary. International Standards Organisation: Geneva, Switzerland.
[21] ISO 31000:2018 (2018). Risk Management—Guidelines. International Standards Organisation: Geneva, Switzerland.
[22] ISO/IEC 31010:2019 (2019). Risk Management—Risk Assessment Techniques. The International Organization for Standardization and The International Electrotechnical Commission: Geneva, Switzerland.
[23] Keil, M., Cule, P. E., Lyytinen, K., Schmidt, R. C. (1998). A framework for identifying software project risks. Communications of the ACM, 41(11), 76-83.
[24] Khanfar, K., Elzamly, A., Al-Ahmad, W., El-Qawasmeh, E., Khalid, A., Abuleil, S. (2008). Managing Software Project Risks with the Chi-Square (χ²) Technique. International Management Review, 4(2), 18-29.
[25] Kraut, N., Transchel, F. (2022). On the Application of SCRUM in Data Science Projects. 7th International Conference on Big Data Analytics (ICBDA).
[26] Kutzias, D., Dukino, C., Kett, H. (2021). Towards a Continuous Process Model for Data Science Projects. In C. Leitner, W. Ganz, D. Satterfield, C. Bassano (Eds.), Lecture Notes in Networks and Systems. Advances in the Human Side of Service Engineering, Vol. 266, pp. 204–210. Springer International Publishing.
[27] Lahiri, S., Saltz, J. (2022). The Risk Management Process for Data Science: Gaps in Current Practices. Proceedings of the 55th Hawaii International Conference on System Sciences.
[28] Lai, ST., Leu, FY. (2018). A Critical Quality Measurement Model for Managing and Controlling Big Data Project Risks. In: Barolli, L., Xhafa, F., Conesa, J. (eds) Advances on Broad-Band Wireless Computing, Communication and Applications. BWCCA 2017. Lecture Notes on Data Engineering and Communications Technologies, vol 12. Springer, Cham.
[29] Limesha, G. (2021). Critical success factors for managing data science projects within agile methodology (Doctoral dissertation).
[30] Marquardt, K. (2017). Smart services – characteristics, challenges, opportunities and business models. Proceedings of the International Conference on Business Excellence, Vol. 11, No. 1, pp. 789–801.
[31] Mizuno, O., Kikuno, T. (2000). Characterization of Risky Projects based on Project Managers’ Evaluation. ICSE ’00: Proceedings of the 22nd international conference on Software engineering, pp. 387-395.
[32] Muhlbauer, W. K. (2004). Pipeline risk management manual: ideas, techniques, and resources. Elsevier.
[33] Pasaribu, R., Taufik, T. A. (2021). Risk Management Implementation at XYZ Project Using Failure Mode Effect Analysis and Hybrid Multi Criteria Decision Making. 2nd International Conference on Management of Technology, Innovation, and Project, 2020.
[34] Pilliang, M., Munawar, M. (2022). Risk Management in Software Development Projects: A Systematic Literature Review. Khazanah Informatika Jurnal Ilmu Komputer dan Informatika. 8. 3. 10.23917/khif.v8i2.17488.
[35] Ransbotham, S., Kiron, D. (2017). Analytics as a Source of Business Innovation. MIT Sloan Management Review.
[36] Reggio, G., Astesiano, E. (2020, August). Big-data/analytics projects failure: a literature review. In 2020 46th Euromicro Conference on Software Engineering and Advanced Applications (SEAA) (pp. 246-255). IEEE.
[37] Rekha, J. H., Parvathi, R. (2015). Survey on Software Project Risks and Big Data Analytics. 2nd International Symposium on Big Data and Cloud Computing (ISBCC’15). Procedia Computer Science 50, pp. 295-300.
[38] Robinson, E., Nolis, J. (2020). Build a Career in Data Science. New York: Manning Publications Co. ISBN 9781617296246.
[39] Saltz, J. S., Lahiri, S. (2020). The Need for an Enterprise Risk Management Framework for Big Data Science Projects. In: DATA (pp. 268-274).
[40] Saltz, J., Shamshurin, I., Crowston, K. (2017). Comparing Data Science Project Management Methodologies via a Controlled Experiment. 10.24251/HICSS.2017.120.
[41] Saltz, J., Shamshurin, I. (2019). Achieving Agile Big Data Science: The Evolution of a team’s Agile Process Methodology. 2019 IEEE International Conference on Big Data, pp. 3477-3485.
[42] Saltz, J., Sutherland, A. (2019). SKI: An Agile Framework for Data Science. 2019 IEEE International Conference on Big Data (Big Data), Los Angeles, CA, USA, 2019, pp. 3468-3476, doi: 10.1109/BigData47090.2019.9005591.
[43] Saltz, J., Wild, D., Hotz, N., Stirling, K. (2018). Exploring Project Management Methodologies Used Within Data Science Teams. Twenty-fourth Americas Conference on Information Systems, New Orleans, 2018, pp. 1-5.
[44] Schmidt, R., Lyytinen, K., Keil, M., Cule, P. (2001). Identifying Software Project Risks: An International Delphi Study. Journal of Management Information Systems, 17:4, 5-36, DOI: 10.1080/07421222.2001.11045662
[45] Tanga, O., Akinradewo, O., Aigbavboa, C., Oke, A., Adekunle, S. (2022). Data Management Risks: A Bane of Construction Project Performance. Sustainability, 14(19):12793. https://doi.org/10.3390/su141912793
[46] Turkay, C., Pezzotti, N., Binnig, C., Strobelt, H., Hammer, B., Keim, D., Fekete, J.-D., Palpanas, T., Wang, Y., Rusu, F. (2018). Progressive Data Science: Potential and Challenges.
[47] Varela, C., Domingues, L. (2022). Risks of Data Science Projects-A Delphi Study. Procedia Computer Science, 196, 982-989.
[48] Verma, A., Yurov, K.M., Lane, P. L., Yurova, Y. V. (2019). An investigation of skill requirements for business and data analytics positions: A content analysis of job advertisements. Journal of Education For Business, Vol. 94, Issue 4, pp. 1-8.
[49] Brocke, J. V., Simons, A., Niehaves, B., Niehaves, B., Reimer, K., Plattfaut, R., Cleven, A. (2009). Reconstructing the giant: On the importance of rigour in documenting the literature search process. In: ECIS Proceedings 161.
[50] Wallace, L., Keil, M. (2004). Software project risks and their effect on outcomes. Communications of the ACM, 47(4), 68-73.
[51] Weber, P., Gabriel, R., Lux, T., Menke, K. (2022). Basiswissen Wirtschaftsinformatik. Wiesbaden, 4. Aufl., Springer Vieweg.
[52] Webster, J., Watson, R. T. (2002). Analyzing the past to prepare for the future: Writing a literature review. MIS quarterly.
[53] Weiner, J. (2021). Why AI/Data Science Projects Fail. Morgan and Claypool Publishers, San Rafael, California.