<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title></journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Generic requirements template for data analytics</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Aritha Kumarasinghe</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Marite Kirikova</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Artificial Intelligence and Systems Engineering, Riga Technical University</institution>
          ,
          <addr-line>6A Kipsalas Street, Riga, LV-1048</addr-line>
          ,
          <country country="LV">Latvia</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2023</year>
      </pub-date>
      <volume>000</volume>
      <fpage>0</fpage>
      <lpage>0002</lpage>
      <abstract>
        <p>Data analytics projects have become common in many enterprises. However, establishing a data analytics project requires consideration of many factors that are not always recognized at the very beginning of the project. This study seeks to identify which generic requirements must be defined for data analytics projects and proposes a template for these requirements. The template can reduce the complexity of starting an analytics project by providing a checklist of the requirements to be considered at the beginning of the project. The template is derived from the analysis of eight data analytics project reports covering descriptive and diagnostic analytics tasks and is validated against two further completed analytics project reports. Additionally, the template has been applied to two ongoing projects to demonstrate its utility within a project context.</p>
      </abstract>
      <kwd-group>
        <kwd>Requirements</kwd>
        <kwd>Data Analytics</kwd>
        <kwd>Project</kwd>
        <kwd>Requirements Template</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Data analytics is the field of study that relates to systematically analyzing a real-world system by
using mathematical and statistical techniques on data that represents the system in question [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
Within this paper, a data analytics project is considered to consist of the following four phases:
1. Initiation – this phase is the starting point of the analytics project and can include
defining the goal(s) of the project, creating a project plan, and
defining requirements.
2. Acquisition – this phase covers acquiring data from
various data sources and transforming it into a format that allows for the application of data
analytics techniques.
3. Analysis – this phase consists of applying data analysis techniques to the
acquired data to gain insights concerning the system to which the acquired data refers.
4. Presentation – this phase consists of conveying the insights produced by the analysis
phase to the stakeholders of the project.
      </p>
      <p>
        The requirements defined in the initiation phase influence the rest of the data analytics
project. Therefore, it is essential to have a full set of requirements to ensure that the phases that
follow are as smooth as possible. Establishing such requirements from scratch may require
dealing with high complexity due to several issues to be considered and initial ambiguity in
expectations regarding the results [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. To reduce this complexity, this paper examines
completed data analytics projects to see whether their commonalities can be
transferred into a generic requirements template applicable in the initial phases of data analytics
projects.
      </p>
      <p>
        The international standard ISO/IEC/IEEE 29148:2018(E), Systems and software engineering - Life cycle
processes - Requirements engineering [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] defines a
requirement as a statement that translates or expresses a need and its associated constraints and
conditions. In data analytics projects, a special emphasis should be placed on defining the
constraints that relate to a data analytics project, such as what data sources, datasets, software,
tools, and visualization means can be utilized within a specific analytics project [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. Given this
unique nature of the constraints in data analytics projects, it is important to account for their impact on
requirements engineering for such projects.
      </p>
      <p>The goal of this paper is to define a requirements template that could be used in data analytics
projects for faster identification of a relatively complete set of generic requirements for the
project in its initiation phase.</p>
      <p>
        Early in the research, it became clear that the set of generic requirements
depends on the type of analytics, that is, whether it is descriptive, diagnostic, predictive, or
prescriptive [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] analytics. In this paper only descriptive and diagnostic analytics are considered,
leaving predictive and prescriptive analytics out of the scope of the research. Descriptive analytics
utilizes data to show what happened or is happening whereas diagnostic analytics utilizes data
to find the cause of certain events (why something happened) [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>This paper is organized as follows. The research method is described in Section 2. The analysis
process and results for descriptive and diagnostic analysis cases are discussed and demonstrated
in Section 3. Only a descriptive analytics case is used to demonstrate the process as the same
rationale was used for the diagnostic analytics cases. The proposed template is presented in
Section 4. Section 5 discusses the validity of the template and Section 6 briefly concludes the
paper.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Research method</title>
      <p>To identify generic requirements, a bottom-up approach was used. The approach consisted
of the following two steps:
• first, the analytics project cases were analyzed to determine which requirements they follow, and
• then, generalizations were made based on these requirements.</p>
      <p>Four projects were analyzed for each type of analytics (descriptive and diagnostic).</p>
      <p>The analysis of the descriptions of data analytics cases consisted of reviewing research articles
reporting on these cases. When reviewing the research articles, the sentences or phrases that
define a process or a product within the analytics project were identified. This product or process
definition was used to infer potential project requirements, each denoted by “Pro requirement”
followed by a two-part number in which the part before the point is the case number
and the part after the point is the project requirement number within that case. For
instance, Pro requirement 4.2 is the second project requirement elicited through the
analysis of case number four. Each project requirement quotes text (within quotation marks)
from the relevant research article so that the requirement can be traced back to
that analytics project.</p>
      <p>Based on the defined project requirements, generic requirements that relate to all descriptive
or diagnostic analytics projects were derived. The generic requirements will act as a ‘requirement
for requirement’ and will be validated based on whether a project requirement that satisfies the
defined generic requirement can be defined in subsequent analytics projects. A generic
requirement has the same number as the first project requirement it was derived from.</p>
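      <p>The numbering scheme can be made concrete with a minimal Python sketch. It is an illustration only, not part of the tooling used in this study: the names RequirementId, parse_requirement_id, and generic_id are hypothetical, and "the first project requirement" is assumed to mean the one with the lowest case number and then the lowest requirement number. The two parts of a number are kept as separate integers because identifiers such as 1.10 and 1.1 denote different requirements.</p>
      <preformat preformat-type="code">
```python
import re
from typing import Iterable, NamedTuple


class RequirementId(NamedTuple):
    """A requirement number split into its two integer parts."""
    case: int    # case number (the part before the point)
    index: int   # requirement number within that case (the part after the point)


# Matches the trailing number of labels such as "Pro requirement 4.2"
# or "Gen(dia) requirement 1.10".
_NUMBER = re.compile(r"(\d+)\.(\d+)\s*$")


def parse_requirement_id(label: str) -> RequirementId:
    """Extract (case, index) from a requirement label.

    The parts stay separate integers: read as a decimal fraction,
    1.10 would collapse into 1.1.
    """
    match = _NUMBER.search(label)
    if match is None:
        raise ValueError(f"no requirement number in {label!r}")
    return RequirementId(int(match.group(1)), int(match.group(2)))


def generic_id(project_labels: Iterable[str]) -> RequirementId:
    """Number a generic requirement after the first project requirement
    it was derived from (assumed: earliest case, then lowest index)."""
    return min(parse_requirement_id(label) for label in project_labels)
```
      </preformat>
      <p>Under these assumptions, parse_requirement_id("Pro requirement 4.2") yields case 4 and requirement 2, and a generic requirement derived from Pro requirements 3.1 and 4.5 would be numbered 3.1.</p>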
      <p>
        The template for generic requirements is organized according to the specific project attributes
shown in Figure 1. The attributes included in the figure were drawn from a total of nine research
articles that either stated them explicitly (as is the case with [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]) or allowed the attributes to be derived (as is the
case with [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]). The detailed description of the attributes and reasons for their choice is out of the
scope of this paper and is available on GitHub [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. These attributes were chosen to ensure that
the aspects of all analytics phases are properly considered at the very beginning of the project
when stating the requirements for the analytics task.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. The analysis of analytics cases</title>
      <p>
        This section demonstrates the analysis of analytics project cases and introduces generic
requirements for descriptive and diagnostic analytics projects. For each analytics type, four cases
were analyzed to gather the generic requirements and one case was used to test their
applicability. The analysis was carried out on published research articles that report on
analytics project cases, retrieved with the search terms ‘Diagnostic Analytics’ or ‘Descriptive Analytics’
from the bibliographic databases IEEE Xplore and ScienceDirect. For
descriptive analytics, the analysis of one case is illustrated in this
section; the analysis of the other cases is available on GitHub [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. In the analysis, all requirements
are organized according to the analytics project phases depicted in Fig. 1 so that all phases are
taken into account from the very beginning of the project.
      </p>
      <sec id="sec-3-1">
        <title>3.1 Descriptive analytics projects</title>
        <p>
          The form of analysis is demonstrated here on one of the descriptive analytics projects (Case 1)
“Students’ perceptions of a community health advocacy skills building activity: A descriptive
analysis” [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ]. The analysis produced the following results.
        </p>
        <p>Initiation phase:
Pro requirement 1.1: This analytics project must “explore students' perceptions of the benefits of
a discussion activity about a controversial health issue, and to describe the impact of the
opportunities to form valid arguments using empirical evidence on students' perceptions of their
ability to be advocates”.</p>
        <p>Gen requirement 1.1: An analytics project must have a clearly defined goal.</p>
        <p>Pro requirement 1.2: The methods used in this project will consist of “students were invited to
provide feedback on their perceptions of activity benefits. Descriptive analyses were conducted”.
Gen requirement 1.2: The analytics project must have a clearly defined strategy that will be used
to achieve the mentioned goal. The main point of emphasis is what type of data analytics will be
required to achieve the goal.</p>
        <p>Acquisition phase:
Pro requirement 1.3: This project will use “post assignment survey” and “included questions
asking how much the activity helped the student learn the following advocacy skills: (1) form a
valid argument using scientific evidence; (2) use credible sources when forming opinions; and (3)
begin to see themselves as advocates for improving the health of individuals and communities.”
Gen requirement 1.3: The data analytics project must have a source(s) of data and a defined approach to how the data will be
collected.</p>
        <p>Analysis phase:
Pro requirement 1.4: The project will carry out descriptive analysis by using “Descriptive
statistics”.</p>
        <p>Gen requirement 1.4: The specific method(s) that will be used to carry out the analysis must be
selected.</p>
        <p>Pro requirement 1.5: The project will use IBM's “SPSS” software to conduct descriptive statistics.
Gen requirement 1.5: The software or tools that will be used to carry out the analysis must be
explicitly mentioned.</p>
        <p>Presentation phase:
Pro requirement 1.6: The insights provided by the data analytics project will be presented in the
form of a bar chart showing the “frequency distribution” for each of the responses by each
category of students (graduate or undergraduate).</p>
        <p>Gen requirement 1.6: Whether and how the findings of the data analysis are to be visualized must be
defined.</p>
        <p>
          Four other cases analyzed are listed below:
• Alopecia areata: Descriptive analysis in a Brazilian sample [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ] (Descriptive analytics
Case 2).
• Restaurant closures during the COVID-19 pandemic: A descriptive analysis [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]
(Descriptive analytics Case 3).
• Descriptive Analytics using Visualization for Local Government Income in Indonesia [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ]
(Descriptive analytics Case 4).
• Factors contributing to coronavirus disease 2019 vaccine hesitancy among healthcare
workers in Iran: A descriptive-analytical study [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ] (Descriptive analytics Case 5; this case
was used for testing the applicability of the generic requirements).
        </p>
        <p>
          The results of the analysis of all descriptive analytics cases can be found in the file “Descriptive
analytics case studies” within GitHub [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ].
        </p>
        <p>Through the analysis of Cases 1-4, the following generic requirements for descriptive analytics
projects were defined (Gen(des) denotes generic requirements that relate to descriptive analytics
projects):
Initiation phase
1. Gen(des) requirement 1.1: An analytics project must have a clearly defined goal.
2. Gen(des) requirement 1.2: The analytics project must have a clearly defined strategy that
will be used to achieve the defined goal.
3. Gen(des) requirement 3.1: The level of detail and technicality required when describing
the analytics methods used within the project must be defined based on the knowledge level
of the client(s).</p>
        <p>Acquisition phase
4. Gen(des) requirement 1.3: The analytics project must have a source(s) of data and an
approach to how it will be collected.
5. Gen(des) requirement 2.1: The permissions regarding the use of incomplete data sets
must be defined within the context of the analytics project.
6. Gen(des) requirement 3.2: How data was collected from the data source must be defined
for an analytics project.
7. Gen(des) requirement 4.1: The data contained within the data source must be defined as
well as which of that data will be used for the data analytics.
8. Gen(des) requirement 4.2: Specifications regarding the ETL (Extract, Transform, Load) process
must be defined for analytics projects.
9. Gen(des) requirement 4.3: How data warehousing is carried out in the analytics project,
as well as the specifications of the data warehouse, must be defined.</p>
        <p>Analysis phase
10. Gen(des) requirement 1.4: The specific method(s) that will be used to carry out the
analysis must be selected.
11. Gen(des) requirement 1.5: The software or tools that will be used to carry out the analysis
must be explicitly defined.
12. Gen(des) requirement 3.3: The practices to be used to ensure the reliability of analysis
results must be defined for the analytics project.</p>
        <p>Presentation phase
13. Gen(des) requirement 1.6: How the findings of the analysis are to be visualized must be
defined.
14. Gen(des) requirement 4.4: How the findings of the analysis can be used must be defined.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2 Diagnostic analytics projects</title>
        <p>
          Similarly, five diagnostic analytics projects were analyzed:
• Diagnostic Analysis for outlier detection in big data analysis [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ] (Diagnostic analytics
Case 1).
• Diagnostic analysis of regional ozone pollution in Yangtze River Delta, China: A case study
in Summer 2020 [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ] (Diagnostic analytics Case 2).
• Mixed logit model-based diagnostic analysis of bicycle-vehicle crashes at daytime and
nighttime [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ] (Diagnostic analytics Case 3).
• Diagnostic analysis of distributed input and parameter datasets in Mediterranean basin
streamflow modeling [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ] (Diagnostic analytics Case 4).
        </p>
        <p>
          • Diagnostic analysis of a single-cell Proton Exchange Membrane unitized regenerative fuel cell
using numerical simulation [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ] (Diagnostic analytics Case 5; this case was used for testing the
applicability of the derived generic requirements).
        </p>
        <p>
          The results of the case analysis can be found in the file “Diagnostics analytics case studies”
within GitHub [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]. Through the analysis of diagnostic analytics Cases 1-4, the following
generic requirements for diagnostic analytics projects were defined (Gen(dia) denotes generic requirements
for diagnostic analytics projects):
Initiation phase:
1. Gen(dia) requirement 1.1: The analytics project must have a clearly defined goal.
2. Gen(dia) requirement 1.2: The analyst must be aware of the level of expertise of the client
and define the key terminology within the context of the data analytics project accordingly.
3. Gen(dia) requirement 1.3: The analytics project must have a clearly defined
system/object on which the diagnostic analysis is carried out.
4. Gen(dia) requirement 1.4: The analytics project must have a quantitative metric(s) that
is used to evaluate the system.
5. Gen(dia) requirement 4.1: The analytics project must define how the results of the data
analytics will be validated or verified.
        </p>
        <p>Acquisition phase:
6. Gen(dia) requirement 1.5: The analytics project must have defined which data sets will
be used and where these data sets will be acquired.
7. Gen(dia) requirement 1.6: The properties of the dataset that is used within the analytics
project must be defined.
8. Gen(dia) requirement 2.1: The model(s) used within the analytics project, along with
what said models are used for, must be defined for a diagnostic analytics project.
9. Gen(dia) requirement 4.2: The derived data used in the analytics project must be defined.</p>
        <p>Analysis phase:
10. Gen(dia) requirement 1.7: The ‘strategy’ that will be employed to carry out the diagnostic
analysis must be defined.
11. Gen(dia) requirement 1.8: An analytics project must have an in-depth definition of the
strategy that will be used to carry out the analytics. This includes specific equations that will
be used and what the variables within said equations are.
12. Gen(dia) requirement 2.2: The tools that are going to be used in the analytics project, and
what those tools will be used for, must be clearly defined.</p>
        <p>Presentation phase:
13. Gen(dia) requirement 1.9: The graphical representations that are required when
presenting the diagnostic results must be defined.
14. Gen(dia) requirement 1.10: The format in which the results of a diagnostic project are
textually presented must be defined.
15. Gen(dia) requirement 3.1: The format by which the different causes of the issue are
categorized must be clearly stated when presenting the results of the analytics project.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Definition of requirements template based on analytics project attributes</title>
      <p>To create a requirements template that might apply to any analytics project (starting with
descriptive and diagnostic analytics projects), the generic requirements that were defined
through the case analysis were integrated based on what project attributes they were trying to
define. This was done by assessing what analytics project attribute values (see Fig. 1) could be
defined from the information provided in the project requirement that was used to initially derive
the generic requirement. If the project requirement could be used to define the value of a specific
analytics project attribute, the generic requirement that is associated with the project
requirement was assigned to the specific analytics project attribute. The results were
amalgamated in the model shown in Fig. 2, which consists of three levels:
• the concept of an Analytics Project,
• the dissection of the analytics project attributes based on the project phases,
• and the analytics project attributes that relate to each of the project phases. The final level
also shows which generic requirements (Gen(des) or Gen(dia)) can be used to define each project
attribute.</p>
      <p>The model is used as a frame that helps to see how different attributes are supported by the
cases analyzed. At this stage, only four base cases and one testing case were used for each of the two types of
analytics. When more analytics cases are reviewed, it will be possible to see whether new attributes
should be included in the model. As the model allows us to systematically keep track of knowledge
developments reported in scientific research, it can be used as the base for developing further
artifacts.</p>
      <p>Based on the analytics project model for generic requirements (see Fig. 2), the requirements
template for generic requirements of descriptive and diagnostic analytics projects shown in Fig.
3 was defined. The requirements template consists of the following types of requirements:
• Gen (Generic) requirements (relevant for both types of analytics) and
• Sec (Secondary) requirements that point to the conditional requirements needed to be
defined within an analytics project.</p>
      <p>The phases to which the requirements relate are shown in blue italic text. The
brackets after each requirement give the references to the initial generic requirements. Grey color
marks requirements that are generic by nature but whose
application depends on conditional (Secondary) requirements; that is, they become part of the list
of generic requirements only if the conditions specified in the Secondary requirements, which
refer to different types of analytics, are met. With this approach, a single template can be used
instead of a separate template for each type of analytics (in our case, descriptive and
diagnostic analytics).</p>
      <p>Figure 2: Analytics project model for generic requirements</p>
      <p>Requirements Template for Analytics Project
Initiation:
Gen requirement 1: An analytics project must have a clearly defined goal. (Gen requirement
1 is based on Gen(des) 1.1 and Gen(dia) 1.1)
Gen requirement 2: The data analyst must be aware of the level of expertise of the client and
define the key terminology within the context of the data analytics project accordingly. (Gen
requirement 2 is based on Gen(dia) 1.2 and Gen(des) 3.1)
Gen requirement 3: The analytics project must have a clearly defined system/object on which
the analysis is being carried out. (Gen requirement 3 is based on Gen(dia) 1.3)
Sec requirement 1: If the project is carrying out diagnostic analytics, then define Gen
requirement 4.</p>
      <p>Gen requirement 4: The analytics project must have a quantitative metric(s) that is used to
evaluate the system. (Gen requirement 4 and Sec requirement 1 are based on Gen(dia) 1.4)
Sec requirement 2: If the results of the analytics project must be validated, then define
Gen requirement 5.</p>
      <p>Gen requirement 5: The analytics project must have defined how the results of the data
analysis will be validated or verified. (Gen requirement 5 is based on Gen(dia) 4.1)
Acquisition:
Gen requirement 6: The analytics project must have defined dataset(s) that will be used to
carry out the analysis, as well as what data is contained within the defined dataset(s). (Gen
requirement 6 is based on Gen(dia) 1.5 and Gen(dia) 1.6)
Gen requirement 7: The source(s) from which the data (including derived data) is acquired
must be specified, alongside how data will be acquired from said source(s). (Gen requirement
7 is based on: Gen(des) 1.3, Gen(des) 4.2, Gen(des) 4.1, and Gen(dia) 4.2)
Gen requirement 8: Procedures regarding the use of incomplete data sets must be defined.
(Gen requirement 8 is based on Gen(des) 2.1)
Sec requirement 3: If this analytics project will use models, then define Gen requirement 9.
Gen requirement 9: The preexisting analytical model(s) used within the analytics project must
be defined alongside the source of said model and its utility within the analytics project. (Gen
requirement 9 and Sec requirement 3 are based on Gen(dia) 2.1)
Analysis:
Gen requirement 10: The approach (defined by a single term or short phrase such as
“histogram-based visualization”) that will be used to carry out the analysis must be defined.
(Gen requirement 10 is based on Gen(des) 1.4 and Gen(dia) 1.7)
Gen requirement 11: The specifics that relate to the analysis approach such as mathematical
equations, algorithms, and analytics techniques must be defined. (Gen requirement 11 is based
on Gen(dia) 1.8)
Gen requirement 12: The tools/software used to carry out the analysis must be defined for the
analytics project. (Gen requirement 12 is based on Gen(dia) 2.2 and Gen(des) 1.5)
Gen requirement 13: Practices that relate to the reliability of analysis results must be defined
for the analytics project. (Gen requirement 13 is based on Gen(des) 3.3)
Presentation:
Gen requirement 14: Specifications regarding the visualization(s) that must be produced as
the result of the project must be defined. (Gen requirement 14 is based on Gen(dia) 1.9 and
Gen(des) 1.6)
Gen requirement 15: The specifications relating to formatting preferences of the textual report
showing the results of the analytics project must be defined for the analytics project. (Gen
requirement 15 is based on: Gen(dia) 1.10, Gen(dia) 3.1, Gen(des) 3.1, and Gen(des) 1.2)</p>
      <p>This template can be used as a checklist of the requirements that need to be defined for a data
analytics project, ensuring that the main issues relating to the different aspects (phases) of the data
analytics project are addressed. The template allows personnel involved in analytics
projects to define a minimum set of requirements for the project.</p>
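      <p>The checklist use of the template, including the conditional (Sec) requirements, can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions rather than an implementation accompanying the template: TemplateItem, TEMPLATE, applicable_requirements, and open_items are hypothetical names, and the condition keys is_diagnostic, needs_validation, and uses_models are invented labels for Sec requirements 1, 2, and 3.</p>
      <preformat preformat-type="code">
```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class TemplateItem:
    number: int                      # Gen requirement number in the template
    phase: str                       # project phase the requirement belongs to
    condition: Optional[str] = None  # hypothetical key of the activating Sec requirement


# Gen requirements 1 to 15 as listed in the template.
TEMPLATE: List[TemplateItem] = [
    TemplateItem(1, "Initiation"),
    TemplateItem(2, "Initiation"),
    TemplateItem(3, "Initiation"),
    TemplateItem(4, "Initiation", condition="is_diagnostic"),     # Sec requirement 1
    TemplateItem(5, "Initiation", condition="needs_validation"),  # Sec requirement 2
    TemplateItem(6, "Acquisition"),
    TemplateItem(7, "Acquisition"),
    TemplateItem(8, "Acquisition"),
    TemplateItem(9, "Acquisition", condition="uses_models"),      # Sec requirement 3
    TemplateItem(10, "Analysis"),
    TemplateItem(11, "Analysis"),
    TemplateItem(12, "Analysis"),
    TemplateItem(13, "Analysis"),
    TemplateItem(14, "Presentation"),
    TemplateItem(15, "Presentation"),
]


def applicable_requirements(answers: Dict[str, bool]) -> List[int]:
    """Return the Gen requirement numbers that must be defined for a project.

    Unconditional items always apply; a conditional item applies only when
    the answer to its Sec requirement is True (missing answers count as False).
    """
    return [
        item.number
        for item in TEMPLATE
        if item.condition is None or answers.get(item.condition, False)
    ]


def open_items(answers: Dict[str, bool], defined: Dict[int, str]) -> List[int]:
    """Checklist view: applicable requirements that have no statement yet."""
    return [n for n in applicable_requirements(answers) if not defined.get(n)]
```
      </preformat>
      <p>In this sketch, a descriptive project that needs neither validation nor preexisting models is left with the twelve unconditional requirements, whereas answering all three conditions positively activates all fifteen. Keeping the conditions as data rather than as separate templates mirrors the single-template design described above.</p>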
    </sec>
    <sec id="sec-5">
      <title>5. Application of the requirements template within the analytics project cases</title>
      <p>One of the projects that was used to see the practical usefulness of the created template was a
multi-contextual real estate analytics project. The goal of this project was to undertake research
that results in the development of an analytics solution for parties that have a stake in the real
estate industry such as real estate management companies, banks, or local governments. The
progress of the project so far consisted of a single brainstorming session where the researchers
and stakeholders from a real estate management company discussed the project. Therefore, it
was possible to conclude that the project was still in incubation.</p>
      <p>A project stakeholder was contacted to elicit the requirements that satisfy the generic
requirements template. A total of 13 requirements were defined for the multi-contextual real
estate analytics project, with a time investment from the stakeholder of at most 70 minutes.
Using the requirements template, the data analyst was able to define the goal of the project and
potential data sources, to define specifications regarding the end product, which in this
case was a BI (Business Intelligence) dashboard, and to choose potential analysis techniques that
apply to the analytics project.</p>
      <p>While the template allowed the definition of 13 previously unstated requirements, the
stakeholder pointed out that he would expect much more specific requirements to be defined.
Therefore, from this evaluation case, we can see that the generic requirements template proposed
in this paper can serve only as a starting point in the requirements definition. It is a question of
further research whether it is possible to define generic requirements templates for more specific
requirements.</p>
      <p>The requirements template was also applied within the context of a multi-disciplinary
research project conducted by a team of undergraduate and graduate students that undertook
the development of a classification model that relates to the human gut microbiome. The
requirements template aided with the definition of the datasets that could be utilized to train and
test the model as well as in defining the various data sources that could be utilized to create the
dataset. It also proved useful when conveying details of the project, such as which
analysis techniques and data sources would be utilized within the project. The fact that this
information was explicitly defined during the project points to the potential utility of the generic
requirements template as a tool for conveying the processes undertaken within the context of an
analytics research project.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion and future works</title>
      <p>The goal of this research was to analyze descriptive and diagnostic analytics case studies to
develop a generic requirements template that would apply to all descriptive and diagnostic
analytics projects. The defined template was then applied to two real analytics projects and
yielded sets of requirements useful in the early stages of these projects. The requirements
template allowed the users to produce, in a relatively short time, an initial set of requirements
for the data analytics project in which all subsequent data analytics phases and relevant analytics
project attributes were taken into account. Considering a predefined set of attributes at the
beginning of the project helps to reduce the project complexity (the attributes are given and
need not be discovered from scratch) and, consequently, the time needed for requirements
definition.</p>
      <p>Further research can expand on the proposed generic requirements template by analyzing
more cases including those that relate to prescriptive and predictive analytics, as well as applying
the defined requirements template in other project cases to better understand the utility of the
template and to better test its viability. Also, it might be investigated whether generic
requirements templates can be defined for obtaining more detailed requirements.</p>
    </sec>
  </body>
  <back>
  </back>
</article>