<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Parameter Settings Optimization in MapReduce Big Data processing using the MOPSO Algorithm</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Lennah Etyang</string-name>
          <email>etyang.lennah@students.jkuat.ac.ke</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Lawrence Nderu</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Jomo Kenyatta University of Agriculture and Technology</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Nairobi</institution>
          ,
          <country country="KE">Kenya</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>School of Computing and Information Technology</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>Big data is a highly valued commodity across the globe. It is not regarded merely as data; in the world of experts, intelligence can be derived from it. Because of its characteristics, namely Variety, Value, Volume and Velocity, and the growing need to handle it, organizations face difficulties in ensuring optimal as well as affordable processing and storage of large datasets. One existing technology used for rapid processing together with storage of big data is Hadoop MapReduce. MapReduce is used in parallel and distributed computing environments for large-scale data processing, whereas Hadoop runs applications and stores data in clusters of commodity hardware. Furthermore, the Hadoop MapReduce framework exposes more than 190 configuration parameters which are mostly tuned manually. Due to the complex interactions between parameters and the large search spaces involved, manual tuning is not effective. Even worse, these parameters have to be tuned every time Hadoop MapReduce applications are run. The main objective of this research is to develop an algorithm which enhances performance by automatically optimizing parameter settings while MapReduce jobs are running. The algorithm employs the Multi Objective Particle Swarm Optimization (MOPSO) concept, which makes use of two objective functions in order to optimize the parameters by searching for a pareto optimal solution. The results of the experiments show that the algorithm remarkably improves MapReduce job performance in comparison to the use of default settings. Index Terms - Multi Objective Problems, MOPSO, PITCH, MapReduce, Pareto Optimality, Parallel Computing Toolbox.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>I. INTRODUCTION</title>
      <p>
        Volumes of data originating from a variety of sources
have shown significant growth. For big data, various techniques
are needed to enable one to process and store it as it grows
over time, considering its characteristics[
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. In spite of many organizations knowing
      </p>
    </sec>
    <sec id="sec-2">
      <title>Lawrence Nderu</title>
      <p>
the value and benefits of this data, and gaining ever more
access to it, they still struggle with the challenge of ensuring
methods and techniques which provide affordable storage and
processing of big data[
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. This is due to the
expanding demand to handle data which keeps
growing over time[
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
      <p>
        Hadoop MapReduce is a computing technology used in
processing and storing big data[
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. It is capable of
carrying out parallel and distributed processing and storage
on different clusters in a manner which ensures scalability
and fault tolerance. Hadoop has more than 190
configuration parameters which need to be set in order to
facilitate execution of jobs, yet the main challenge is that most
users are not even aware of these parameters. Due to a lack
of appropriate skills, they are unable to make use
of these important configuration options. Moreover, if these
parameters are not assigned appropriate values, the system
automatically picks the default values, which leads to high
consumption and ineffective use of the computing resources,
making it difficult for Hadoop MapReduce to achieve
optimum performance. Furthermore, finding an algorithm that
can develop a multi objective function by correlating
configuration parameters has been extremely difficult, since the
way parameter settings are interconnected is complex.
      </p>
      <p>
        For this reason, we developed PITCH to automatically tune
parameter settings while MapReduce jobs are running[
        <xref ref-type="bibr" rid="ref20">20</xref>
        ]. This
algorithm employed the MOPSO concept which uses more
than one condition to decide and conclude the best
solution[
        <xref ref-type="bibr" rid="ref21">21</xref>
        ]. PITCH has proved to be very powerful in dealing
with multi objective problems, providing a set of good
solutions while taking into consideration
pareto optimality[
        <xref ref-type="bibr" rid="ref24">24</xref>
        ]. After running several experiments, our
results showed a significant improvement in the execution time
of MapReduce jobs as compared to the use of the default
settings.
      </p>
      <p>Section II of this paper gives a conceptual overview of
Hadoop and MapReduce, Section III discusses related work,
while Sections IV to VI present our contribution. Sections VII
and VIII contain the discussion together with recommendations,
and Section IX provides the conclusion and future work.</p>
      <p>II. CONCEPTUAL OVERVIEW OF HADOOP MAPREDUCE
Hadoop has two key parts: the Hadoop Distributed File
System (HDFS) and MapReduce. MapReduce runs on top of HDFS,
which is scalable and parallel. A Hadoop job involves two
distinct kinds of tasks, map tasks and reduce tasks. The map
task takes the input dataset and emits intermediate key/value
pairs, which are sorted and partitioned depending
on the reducer. The output of each map task is then moved
to the reducer in order to generate the final output.</p>
      <sec id="sec-2-1">
        <title>A. Hadoop Architecture</title>
        <p>HDFS plays the main role in storage due to its reliability and
fault tolerance. It has the ability to configure
block replication to ensure data is well protected
and recovery mechanisms are available in scalable and fault
tolerant situations. The most important Hadoop modules are
the job tracker together with the task trackers. The job tracker is
the master and its main role is to ensure user jobs which are
submitted are split into smaller tasks. It also schedules
tasks, ensuring those which fail are re-executed.
Jobs are then assigned to task trackers in the clusters of
nodes. The task tracker’s role is to process the nodes used
to run MapReduce tasks. It sends a set of messages to
the job tracker carrying information on the status
of the running tasks and which slots are available.
“Fig. 1” shows the Hadoop architecture.</p>
      </sec>
      <sec id="sec-2-2">
        <title>B. MapReduce Architecture</title>
        <p>The entire map/reduce process does its operations using
three phases.</p>
        <p>1) Mapping: Being the first phase in MapReduce program
execution, this is the phase whereby data is fed to the map
function, which produces key-value pairs as output.</p>
        <p>2) Shuffling: Shuffling phase takes in the output from the
mapping phase. This involves putting together relevant values
from the Map output.</p>
        <p>3) Reducing: The phase consolidates all the output values
which come from the shuffling phase. The values are then
put together as a summarized dataset which are returned as a
complete output as shown in “Fig. 2” .</p>
        <p>III. RELATED WORK
In recent years, research has been done for the purpose
of Hadoop MapReduce performance optimization.
Methodologies used by existing studies vary from optimization of
parameter values to scheduling of jobs. Load balancing
and data locality are also areas which have received focus.</p>
        <p>
          Mansaf Alam[
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] did research on a Hadoop framework for
big data analytics in the cloud environment. They outlined
that even though the rate of growth of big data is fast,
management of that data is challenging due to its
varying characteristics. Their work involved classification
of input data then routing this data to several nodes for
processing. When processing of each node was completed,
they put together the output of every node to get the final
result. Hadoop in this case was used to facilitate partitioning
and processing of data.
        </p>
        <p>
          Nikhil Rajyaguru[
          <xref ref-type="bibr" rid="ref7">7</xref>
          ] compared and contrasted all
the big data technologies which were in
use by mobile applications. The MapReduce framework was used
to compare these technologies using test cases from already
ongoing research. The result of their research showed that big
data tools were usable in computers only but not in mobile
devices.
        </p>
        <p>
          Yang[
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] developed a model for big data processing in a
parallel computing environment. He described the process of
executing MapReduce jobs in the cloud
together with writing mapper and reducer classes using
objects. The framework was able to handle large datasets but
it did not address tedious details and complexities
such as scalability.
        </p>
        <p>
          Samira Daneshyar[
          <xref ref-type="bibr" rid="ref9">9</xref>
          ] provided a systematic analysis and a
comprehensive review of processing large datasets. In the same
case, they described challenges encountered in data handling
and all the computing requirements using Hadoop MapReduce.
They showed all the requirements which make it possible for
MapReduce to process big data and they finally demonstrated
in an experiment how MapReduce can be run in the cloud.
        </p>
        <p>
          Voruganti[
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] stated that the Hadoop framework works well
when key/value pairs are used. With data which is less
structured, it is able to handle quite a number of challenges in
different problem domains. Their results showed that there are
many forms of data which can be successfully analyzed after
being transformed to key/value pairs.
        </p>
        <p>
          Guangdeng Liao[
          <xref ref-type="bibr" rid="ref11">11</xref>
          ] performed an evaluation of various
approaches which enabled automatic parameter tuning,
together with cost-based machine learning models.
After establishing that existing models were not
adequate, they came up with the Gunther algorithm for the purpose
of optimization. Gunther was evaluated in different clusters
which had distinct resources. The results of their experiment
demonstrated that in a small number of trials, Gunther attained
a near optimal performance.
        </p>
        <p>
          Min Li[
          <xref ref-type="bibr" rid="ref12">12</xref>
          ] developed MRONLINE for online
performance tuning. MRONLINE’s job was to monitor job execution and
to tune parameters based on the data which was collected.
In MRONLINE, every task could have a different configuration,
in contrast to other research works which used the same
configuration throughout. An algorithm was used to converge to
near-optimal configurations. The results of their work demonstrated
that MRONLINE effectively improved performance by up to
about 30 percent as compared to default configuration settings.
        </p>
        <p>
          M. Khan, Y. J[
          <xref ref-type="bibr" rid="ref13">13</xref>
          ] developed a Gene Expression
Programming (GEP) model representing a correlation among the
configuration parameters, based on previous records of jobs
run in Hadoop. The model employed the Particle Swarm
Optimization (PSO) method, which made use of an objective function
in order to get parameter settings which are optimal or close
to optimal. In their results, they demonstrated that the work
greatly improved Hadoop’s performance compared to default
settings.
        </p>
        <p>
          All the existing work has shown that changes in parameter
values are done either before or after Hadoop MapReduce
jobs are executed. Also the MOPSO concept has never been
used for parameter tuning in big data processing. For this
reason, we proposed PITCH, an algorithm which used the
MOPSO concept to develop a multi objective function in
order to optimize parameter settings while MapReduce jobs
are running. MOPSO is emerging due to its usefulness in
optimizing more than one objective function at the same time.
In MOPSO, instead of providing a single solution, a set of
solutions known as the pareto optimal set is determined[
          <xref ref-type="bibr" rid="ref19">19</xref>
          ].
        </p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>IV. ENHANCING MOPSO</title>
      <p>
        This section describes how PITCH employed the MOPSO
concept in MapReduce parameter settings optimization.
MOPSO, which is used to solve Multi Objective Problems,
was proposed by Coello Coello in 2004[
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. It is an extension
of the PSO algorithm as introduced by Kennedy and Eberhart
in 1995[
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. PSO was inspired by the complex social behavior
of a flock of birds or a school of fish, whereby simple actions
of each member of the flock flying and performing repetitive
tasks achieve a higher level of intelligence[
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. Swarm
intelligence has been utilized in multi dimensional search
spaces in order to find an optimum solution, thereby
effectively solving single objective problems[
        <xref ref-type="bibr" rid="ref19">19</xref>
        ]. However, PSO
can only get a global optimum solution since the particles in
a swarm have the same flying experience [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ]. The algorithm’s
inability to solve multi objective problems is the reason why
it was modified to a standard MOPSO.
      </p>
      <sec id="sec-3-1">
        <title>A. PITCH algorithm</title>
        <p>
          Multi Objective Optimization Problems (MOOP) have
multiple objectives containing constraints which need to be
satisfied using feasible solutions[
          <xref ref-type="bibr" rid="ref18">18</xref>
          ], found using the pareto
optimality theory[
          <xref ref-type="bibr" rid="ref12">12</xref>
          ]. The PITCH algorithm is shown in
pseudocode form in (a):
(a)
        </p>
        <p>MOOP problems having a number of objectives together
with a number of equality and inequality constraints are
formulated in “(1)”:</p>
        <p>minimize fn(X), n = 0, 1, 2, 3, ..., N
subject to
hk(x) = 0, k = 0, 1, 2, 3, ..., K
gl(x) &lt;= 0, l = 0, 1, 2, 3, ..., L
(1)</p>
        <p>Where: fn(X) are the objective functions, x is a decision vector
that represents solutions, N is the number of objectives, K is the
number of equality constraints and L is the number of inequality
constraints.</p>
        <p>
          In such a situation, the goal is to optimize n objective
functions at the same time with the end result being good
compromise of the solutions which represent the better
tradeoff between objectives[
          <xref ref-type="bibr" rid="ref27">27</xref>
          ]. The choice of parameters can
highly impact the optimization performance therefore in order
to implement PITCH, we first have to randomly initialize the
positions of the particles in the search space to ensure uniform
coverage as well as initializing velocity to zero as shown in
“(2)”.
        </p>
        <p>Vit+1 = wvit + c1r1(pbestit - Xit) + c2r2(gbestit - Xit)</p>
        <p>(2)</p>
        <p>In every iteration, PITCH updates the current position
vector of the swarm by adding the velocity ’V’ using “(3)”:</p>
        <p>Xit+1 = Xit + Vit+1
(3)</p>
        <p>For the functions to run properly, they must be able to
accomplish some specific requirements. In the map phase, the
three input arguments are data, info, and intermKVStore.
data and info are the output of the call which reads from the
input datastore, which MapReduce performs automatically
before every call to the map function. The intermKVStore
object, the intermediate KeyValueStore, is where the map
function adds its key-value pairs. A simple example of a map
function is shown in (b):
(b)</p>
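        <p>Since listing (b) is not reproduced here, the following Python sketch imitates the behaviour the text describes for the map function, with a plain dictionary standing in for the intermKVStore object; in the paper’s MATLAB setting the pair would instead be added through the store itself.</p>

```python
import math

def map_fn(data, intermKVStore):
    # 1) Filter all the NaN values in the block of data.
    clean = [v for v in data if not math.isnan(v)]
    # 2) Build a two-element vector: total distance and count of the block.
    sum_and_length = [sum(clean), len(clean)]
    # 3) Add the vector to the intermediate store under 'sumAndLength'.
    intermKVStore.setdefault("sumAndLength", []).append(sum_and_length)

store = {}  # stand-in for the intermediate KeyValueStore
map_fn([3.0, float("nan"), 5.0], store)
# store now holds [[8.0, 2]] under 'sumAndLength'
```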
        <p>Where: r1 and r2 are cognitive and social randomization
parameters which take random values between 0 and 1. c1
and c2 are local and global weights respectively, which are
acceleration constants. vit denotes particle i’s velocity and
Xit its position. w is the inertia weight and t is an absolute time
index. pbestit is a personal archive holding the most recent
non-dominated positions encountered by the particle in the past.
gbestit, the global best swarm flying experience, is selected
from an external archive. The current external archive
stores non-dominated solutions known as pareto optimal
sets; a new external archive is then obtained
from the current external archive and the population Pt.</p>
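        <p>Using the experimental settings reported later (w = 0.49, c1 = c2 = 1.4269), one update step per “(2)” and “(3)” can be sketched in Python as follows; r1 and r2 are fixed here for reproducibility, whereas the algorithm draws them randomly between 0 and 1 in every iteration.</p>

```python
def update_particle(x, v, pbest, gbest,
                    w=0.49, c1=1.4269, c2=1.4269, r1=0.5, r2=0.5):
    # Velocity update "(2)", applied per dimension.
    v_new = [w * vi + c1 * r1 * (pb - xi) + c2 * r2 * (gb - xi)
             for xi, vi, pb, gb in zip(x, v, pbest, gbest)]
    # Position update "(3)": add the new velocity to the current position.
    x_new = [xi + vi for xi, vi in zip(x, v_new)]
    return x_new, v_new

# One particle starting at rest (velocity initialized to zero).
x_new, v_new = update_particle(x=[0.0], v=[0.0], pbest=[1.0], gbest=[1.0])
```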
        <p>The constriction factors k and l eliminate the need to clamp
the speed, therefore promoting convergence. These factors are
expressed using “(4)” and “(5)”:</p>
        <p>Vit+1 = k*wvit + c1r1(pbestit - Xit) + c2r2(gbestit - Xit)
(4)</p>
        <p>Vit+1 = l*wvit + c1r1(pbestit - Xit) + c2r2(gbestit - Xit)
(5)</p>
        <p>In the map function in (b), the first line is used to filter all
the NaN values in a block of data. The second line’s role is to
create a two-element vector holding the total distance together
with the count of the block. The third line adds this vector of
values to the intermKVStore with the key ’sumAndLength’. After
executing this map function on all the data blocks d in ds,
the intermKVStore object will contain the running totals for
every block of data. The inputs of the reduce function are
intermKey, intermValIter, and outKVStore. intermKey is
the active key which is added by the map function. Every
call made by MapReduce to the reduce function
specifies a new unique key from all the keys in the intermediate
KeyValueStore object. intermValIter is the ValueIterator
associated with the intermKey; this object contains
all the values for the active key. Finally, outKVStore, the
final KeyValueStore, is the object to which the reduce function
adds key-value pairs. The output is then returned to
the output datastore. A simple example of a reduce function
is shown in (c):
(c)</p>
        <p>
          Just like the PSO algorithm, particles in PITCH share
information and move towards the global best as well as their
personal best memory. What makes PITCH different from PSO is the
use of more than one condition to define and conclude the best
solution as either global or local[
          <xref ref-type="bibr" rid="ref21">21</xref>
          ]. In this case, all the
non-dominated particles in the swarm are grouped together into a
repository which is geographically based in order to maintain
diversity[
          <xref ref-type="bibr" rid="ref22">22</xref>
          ]. It is in the repository that every particle selects
its global best objective amongst its members[
          <xref ref-type="bibr" rid="ref23">23</xref>
          ]. A personal
best particle is selected based on the level of domination as
well as probabilistic rules, since each particle has its own
swarm flying experience.
        </p>
      </sec>
      <sec id="sec-3-2">
        <title>B. PITCH implementation in MapReduce</title>
        <p>
          A popular implementation of MapReduce is the Hadoop
MapReduce which works with HDFS[
          <xref ref-type="bibr" rid="ref30">30</xref>
          ]. However,
MATLAB also offers this implementation using the MapReduce
function[
          <xref ref-type="bibr" rid="ref31">31</xref>
          ]. It makes use of a datastore to process data in
small blocks that individually fit into memory. Every block of
data goes through the map phase, whereby data is formatted
for processing, then the intermediary data blocks go through
the reduce phase, which puts together all the intermediate
results in order to produce a final output[
          <xref ref-type="bibr" rid="ref31">31</xref>
          ]. The MapReduce
functions are automatically called at execution time[
          <xref ref-type="bibr" rid="ref10">10</xref>
          ].
The reduce function runs as a loop through every distance
and count value in the intermValIter, keeping the running
total distance together with the count after every pass[
          <xref ref-type="bibr" rid="ref16">16</xref>
          ]. When
the loop is complete, the reduce function calculates the
overall mean using a simple division, then finally adds a single
key to outKVStore.
        </p>
        <p>
          One main advantage of MapReduce is its ability to be extended
to run in a number of computing environments[
          <xref ref-type="bibr" rid="ref24">24</xref>
          ].
However, because MapReduce is designed to perform calculations on
large collections of data, it is not well suited to data sets
which can be loaded directly into computer
memory and analyzed with traditional techniques[
          <xref ref-type="bibr" rid="ref5">5</xref>
          ].
For this reason, it is important to use MapReduce for
performing statistical or analytical calculations on datasets which do
not fit in memory[
          <xref ref-type="bibr" rid="ref33">33</xref>
          ]. The mapreducer configuration function
is used to change the execution environment using a Parallel
Computing Toolbox, the MATLAB Parallel Server as well as
the MATLAB Compiler[
          <xref ref-type="bibr" rid="ref31">31</xref>
          ]. This enables one to first start
small by verifying the map and reduce functions before scaling
up to run in larger computations.
        </p>
      </sec>
      <sec id="sec-3-3">
        <title>C. Running MapReduce application in Parallel</title>
        <p>
          The Parallel Computing Toolbox helps in speeding up
MapReduce job execution using the full processing power of
multicore computers in order to execute applications using
a parallel pool of workers. The MATLAB Parallel Server
enables one to run the same applications on remote computer
clusters such as Hadoop using tall arrays which allow one
to run big data applications that do not fit into the computer
memory[
          <xref ref-type="bibr" rid="ref24">24</xref>
          ]. MATLAB compiler on the other hand enables
one to create standalone MapReduce applications or
deployable archives, which can be shared between applications or
moved to production in Hadoop systems[
          <xref ref-type="bibr" rid="ref28">28</xref>
          ].
        </p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>V. PITCH EXPERIMENT SETUP</title>
      <p>In this section, we present the implementation and some
of the experiments which were performed,
showing the outcomes of MapReduce job execution using the
PITCH algorithm.</p>
      <sec id="sec-4-1">
        <title>A. Environment</title>
        <p>Experiments were run on an HP computer with an Intel(R)
Core(TM) i5-8250U CPU @ 1.60 GHz 1.80 GHz and 8 GB
of memory. In the serial environment, we ran the experiments on
a local machine, while in the parallel experiments we used the
MATLAB Parallel Server. The MATLAB Parallel
Computing Toolbox changed the execution environment into
a cluster of four nodes, which scaled up the performance to
run larger computations. Both serial and parallel executions
were run using the MATLAB R2020a application.</p>
      </sec>
      <sec id="sec-4-2">
        <title>B. Performance evaluation</title>
        <p>
          For performance evaluation of the algorithm, we used the
ZDT benchmark function[
          <xref ref-type="bibr" rid="ref30">30</xref>
          ], a test suite which got its name
from its authors Zitzler, Deb and Thiele[
          <xref ref-type="bibr" rid="ref29">29</xref>
          ].
        </p>
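        <p>As an example of the suite, the widely used ZDT1 member can be written in a few lines of Python; on its true pareto front the second objective satisfies f2 = 1 - sqrt(f1). This is the standard textbook definition, not the paper’s MATLAB code.</p>

```python
import math

def zdt1(x):
    # ZDT1 benchmark: two objectives to minimize over x in [0, 1]^n.
    f1 = x[0]
    g = 1.0 + 9.0 * sum(x[1:]) / (len(x) - 1)
    f2 = g * (1.0 - math.sqrt(f1 / g))
    return f1, f2

# On the pareto front x[1:] are all zero, so g == 1 and f2 == 1 - sqrt(f1).
f1, f2 = zdt1([0.25] + [0.0] * 9)  # f1 = 0.25, f2 = 0.5
```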
        <p>The algorithm shared the code for the particles’ movement
and for performing evaluation in order to determine a tradeoff
between our two objectives, which are reducing the quantity
of Random Access Memory (RAM) allocated to every task
and attaining better Input/Output (I/O) performance, hence
improving the execution time of every MapReduce job.</p>
        <p>Starting with problem definition, we came up with the
objective function together with the constraints to be minimized.
The constraints in our case are the MapReduce configuration
parameters which describe the number of decision variables.
These are the I/O location of jobs, the I/O format of data as
well as the classes which contain map and reduce functions.
The motivation for the selected parameter values is based on
the running records of MapReduce jobs, that can be either
memory intensive or I/O intensive. In order to optimize the
problem, we determined the tradeoff between two objectives
which in our case are the quantity of RAM and the I/O
performance, using the following techniques:
(d)
(e)</p>
        <p>We set memory to be maximum and found a single
objective minimal I/O performance over the designs which
satisfy the maximum memory constraint.</p>
        <p>We set I/O performance to be maximum and also found a
single objective minimal memory over the designs which
satisfy the I/O performance constraint.</p>
        <p>We then solved a multi objective problem, visualizing the
tradeoff between the two objectives using the
minimization function in (d).</p>
        <p>We set the number of decision variables to 10, with a matrix size
of 1, having lower and upper bound values of 0 and
1 respectively. We also set the particle swarm size to 100, 200
for the number of iterations and a repository size of 50. The
values of c1 and c2 were set to 1.4269, the value of w was set
to 0.49, and the values of r1 and r2 were selected randomly
between 0 and 1 in every iteration.</p>
        <p>The parameters used in the PITCH algorithm are presented in
“Table I”.</p>
        <p>In every iteration, the new position of a particle was
calculated using the number of unknown variables. The local
best values were compared with the new fitness values within
the repository then updated accordingly. In the same manner,
the global best position was updated. The section of the code
below in (e) is the algorithm’s main loop.</p>
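        <p>Because listing (e) is not reproduced, the repository-maintenance step at the heart of that loop is sketched below in Python: new fitness values are merged into the archive, dominated members are discarded, and the archive is truncated when it exceeds its capacity. The random truncation used here is a simplification of the density-based selection the algorithm applies.</p>

```python
import random

def dominates(a, b):
    # Minimization: a dominates b if no worse everywhere, better somewhere.
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def update_repository(repository, candidates, capacity=50):
    # Merge the new fitness values with the current archive members.
    merged = repository + candidates
    # Keep only non-dominated members (the pareto optimal set so far).
    front = [p for p in merged if not any(dominates(q, p) for q in merged if q != p)]
    unique = list(dict.fromkeys(front))  # de-duplicate, preserving order
    # Truncate when the repository becomes full.
    if len(unique) > capacity:
        unique = random.sample(unique, capacity)
    return unique

rep = update_repository([(1.0, 4.0)], [(2.0, 2.0), (3.0, 3.0)])
# (3.0, 3.0) is dominated by (2.0, 2.0) and is not kept
```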
        <p>TABLE I: Parameters used in the PITCH algorithm.
Number of decision variables: 10
Matrix size: 1
Lower bound of decision variables: 0
Upper bound of decision variables: 1
Number of particles in the swarm: 100
Number of iterations: 200
Repository size: 50
C1 and C2: 1.4269
Inertia weight: 0.49
r1 and r2: random between 0 and 1</p>
      </sec>
      <sec id="sec-4-3">
        <title>C. Movement of particles in a pareto search</title>
        <p>“Fig. 3” and “Fig. 4” show how the particles of PITCH
algorithm moved in the search space towards the pareto
optimal solution considering the number of iterations:</p>
        <p>
          It is well known that multi objective algorithms have some
limitations in relation to convergence and diversity[
          <xref ref-type="bibr" rid="ref21">21</xref>
          ]. In
order to avoid these problems, we used the archiving process
which introduced more convergence and the leader’s selection
method which provided diversity to the search. It used a
procedure that limited the velocity of each particle by the
constriction factors k and l [
          <xref ref-type="bibr" rid="ref26">26</xref>
          ], which vary depending
on the values of C1 and C2. By doing this, the algorithm
introduced more convergence towards the pareto front guiding
the solution to a precise area within the repository. When the
repository became full, the best point among all the solutions
in the repository was obtained by determining the domination,
adding the non-dominated members into the new repository
thereby selecting a new solution. Minimization was obtained
through differential evolution[
          <xref ref-type="bibr" rid="ref32">32</xref>
          ].
        </p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>VI. EXPERIMENTAL RESULTS OF PITCH</title>
      <sec id="sec-5-1">
        <title>A. Experiment 1</title>
        <p>In the first experiment, the MapReduce application was
executed on a serial machine using default parameter settings.
For this experiment, the execution time was 153 seconds.
The MapReduce application was then executed with the PITCH
algorithm using the Parallel Computing Toolbox over two
consecutive iterations. From the output, the execution time was
24 seconds in the first iteration and 18 seconds in the second
iteration.</p>
      </sec>
      <sec id="sec-5-2">
        <title>B. Experiment 2</title>
        <p>In “Fig. 5”, PITCH was executed using the following
settings: Number of particles in the swarm: 125, 150, 175 and
200. Fixed parameters: 200 iterations, 10 decision variables,
50 repository members.</p>
        <p>From the output, it can be seen that there is a gradual
reduction in the execution time from 26.9 to 23.7 seconds.</p>
      </sec>
      <sec id="sec-5-3">
        <title>C. Experiment 3</title>
        <p>In “Fig. 6”, we ran the implementation using the following
settings: Number of members in the repository: 20, 30, 40 and
50. Fixed parameters: 200 iterations, 10 decision variables and
100 particles in the swarm.</p>
        <p>Fig. 6. Reducing the repository size</p>
        <p>It can be seen from experiment 3 that there is still a gradual
reduction of the execution time from 37.6 to 25.3 seconds. If
we were also to increase the hardware and other resources, the
implementation could comparatively scale well.</p>
      </sec>
      <sec id="sec-5-4">
        <title>D. Experiment 4</title>
        <p>In “Fig. 7”, PITCH was executed using the following
settings: Data size: 12MB, 24MB, 50MB and 500MB. Fixed
parameters: 200 iterations, 10 decision variables, 100 particles
in the swarm and repository size 20.</p>
        <p>From experiment 4, the execution time for the 12MB dataset
was 54.2 seconds while for the 500MB dataset it was 17.8 seconds.</p>
        <p>VII. DISCUSSION</p>
        <p>During the experiments, we noted that both the serial and parallel
implementations can be scaled easily and used with larger
swarm sizes in order to handle more complex optimization
problems. We observed that during the first iteration in the
Parallel Computing Toolbox, the algorithm performed more slowly,
but it then improved continually from the second iteration
onwards. In a serial environment, execution time
rose steadily as we increased the values in
the parameter settings. Nonetheless, the implementation using the
Parallel Computing Toolbox showed very promising results.
When we increased the number of particles in the swarm and
reduced the repository size in every iteration, the performance
of the algorithm showed a gradual improvement. Better still,
when we scaled up and increased the data size, there was
a significant improvement in the execution time. In all the
experiments we kept the decision variables and the number
of iterations constant. For the different data sizes
used, we observed that the execution time of a MapReduce job on a
smaller dataset was of no great significance to the performance
of the algorithm compared to when we increased the data
size. This is attributable to the fact that the MapReduce programming
technique is suited to processing data which cannot fit into
the local computer memory. Therefore, a smaller data
size is not suitable for running MapReduce applications.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>VIII. RECOMMENDATIONS</title>
      <p>Based on the range of values tested, there is a need to
establish guidelines to be followed while tuning the
optimization parameters used by PITCH. Some of our
recommendations for effective optimization are as follows:</p>
      <sec id="sec-6-1">
        <title>A. Number of particles in the swarm:</title>
        <p>This is the population size of the algorithm and we
recommend the population size to be from 100 to 200 particles.</p>
      </sec>
      <sec id="sec-6-2">
        <title>B. Number of Iterations:</title>
        <p>This parameter is related to the number of particles in the
swarm: when the number of particles is lower, the number of
iterations should also be smaller.
The recommended range is between 100 and 200.</p>
      </sec>
      <sec id="sec-6-3">
        <title>C. Repository size:</title>
        <p>This parameter gives the maximum number of
nondominated members to be stored in the repository. It also
determines the quality of each Pareto front produced. We
recommend a repository size of between 20 and 50.</p>
      </sec>
      <sec id="sec-6-4">
        <title>D. Inertia weight and damping rate:</title>
        <p>These parameters have a significant effect on convergence
and exploration. The inertia weight should be set to 0.49 and the
damping rate to 0.99.</p>
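<p>The recommended inertia weight and damping rate enter the algorithm through the standard PSO velocity update. The following Python sketch shows where the two constants act (the paper's implementation was in MATLAB, and the cognitive/social coefficients shown are assumed values, not taken from PITCH):</p>

```python
# Illustrative Python sketch of the PSO velocity/position update showing
# where the recommended inertia weight (0.49) and damping rate (0.99) act.
# C1 and C2 are assumed values, not taken from PITCH, whose actual
# implementation was in MATLAB.
import random

W = 0.49       # recommended inertia weight
W_DAMP = 0.99  # recommended per-iteration damping rate
C1 = C2 = 1.5  # cognitive and social coefficients (assumption)

def update_particle(x, v, pbest, gbest, w):
    # Standard PSO update: the inertia term w*v carries the previous
    # velocity; the other two terms pull toward the personal best and
    # the global (repository) best positions.
    new_v = [w * vi
             + C1 * random.random() * (pi - xi)
             + C2 * random.random() * (gi - xi)
             for xi, vi, pi, gi in zip(x, v, pbest, gbest)]
    new_x = [xi + vi for xi, vi in zip(x, new_v)]
    return new_x, new_v

# After each iteration the inertia weight is damped (w = w * W_DAMP),
# gradually shifting the swarm from exploration toward exploitation.
```

<p>Keeping the damping rate just below 1 is what lets the swarm explore widely in early iterations while still converging later on.</p>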
      </sec>
    </sec>
    <sec id="sec-7">
      <title>IX. CONCLUSION</title>
      <p>
        The MOPSO concept is emerging because of its usefulness in
optimizing more than one objective function at the same time.
In PITCH, instead of a single solution, a set of solutions
known as a Pareto optimal set is determined. Most real-world
problems, including those in big data processing, are
multi-objective, so simultaneous optimization is important in
solving them. This can only be achieved through the
use of algorithms. It is also significant to note that the
objective functions being measured are always competing and
conflicting [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ]. Having competing objective functions
is what gives rise to a set of multiple solutions instead of a
single optimal solution, because no single solution is better
than another when all objectives are taken into consideration.
In our research, the main aim was to optimize parameters
with two objective functions, with the goal of exploring
convergence and diversity by adjusting specific features. From
the results, PITCH has proved to be very powerful in dealing
with multi-objective problems as well as in providing a set of
good solutions with respect to Pareto optimality [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ]. Nonetheless, just as with other Multi-Objective
Evolutionary Algorithms (MOEAs), the PITCH algorithm can
decline during multi-objective optimization and may also face
problems of convergence and diversity.
      </p>
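<p>The Pareto-optimality argument above can be made concrete with a minimal dominance check. The following Python sketch is an illustrative assumption, not PITCH's MATLAB code, and assumes both objectives are minimized:</p>

```python
# Minimal sketch of Pareto dominance for minimization problems.
# A solution dominates another when it is at least as good in every
# objective and strictly better in at least one.
def dominates(a, b):
    # a, b: sequences of objective values, both minimized.
    return (all(bi >= ai for ai, bi in zip(a, b))
            and any(bi > ai for ai, bi in zip(a, b)))

def pareto_front(solutions):
    # Keep only the nondominated solutions -- the kind of set a MOPSO
    # repository stores.
    return [s for s in solutions
            if not any(dominates(other, s)
                       for other in solutions if other is not s)]
```

<p>For example, among the objective vectors (1, 2), (2, 1) and (2, 2), the first two are mutually nondominated and form the front, while (2, 2) is dominated by both; such nondominated members are exactly what the repository retains.</p>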
      <p>Future work includes exploring other features of PITCH in
search of more diversified estimates of the Pareto front
without losing convergence. PITCH will also be analyzed
on other benchmarking problems, including those that involve
maximizing an objective function. More remains to be done in areas
which emphasize PITCH parameters’ self-adaptation and efficiency,
as well as application work and theoretical development.</p>
      <p>X. ACKNOWLEDGMENT</p>
      <p>
        The authors would like to acknowledge MathWorks® for
providing MATLAB R2020a[
        <xref ref-type="bibr" rid="ref37">37</xref>
        ], the main software
used in the implementation of this research. More directly, the authors
thank Thomas F. Coleman for his contribution towards
constrained minimization functions and the use of the
Optimization Toolbox™. Last but not least, they thank Jomo Kenyatta
University of Agriculture and Technology for permitting them to
spend time and resources at the University.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Cavanillas</surname>
          </string-name>
          , E. Curry, and W. Wahlster, “
          <article-title>New Horizons for a DataDriven Economy: A Roadmap for Usage and Exploitation of Big Data in Europe,” New Horizons a Data-Driven Econ. A Roadmap Usage Exploit</article-title>
          .
          <article-title>Big Data Eur</article-title>
          ., pp.
          <fpage>1</fpage>
          -
          <lpage>303</lpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>K. N.</given-names>
            <surname>Aye</surname>
          </string-name>
          , “
          <article-title>A Platform for Big Data Analytics on Distributed Scaleout Storage System A Platform for Big Data Analytics on Distributed Scale-out Storage System Kyar Nyo Aye University of Computer Studies</article-title>
          , Yangon A thesis submitted to the University of Computer Studi,” no.
          <source>November</source>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>N.</given-names>
            <surname>Francis</surname>
          </string-name>
          and
          <string-name>
            <given-names>S.</given-names>
            <surname>Kurian</surname>
          </string-name>
          <string-name>
            <surname>K</surname>
          </string-name>
          , “
          <article-title>Data Processing for Big Data Applications using Hadoop Framework,” Ijarcce, no</article-title>
          .
          <source>April</source>
          , pp.
          <fpage>177</fpage>
          -
          <lpage>180</lpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>I. Technologies</surname>
          </string-name>
          , “
          <article-title>Map Reduce a Programming Model for Cloud Computing Based On Hadoop Ecosystem</article-title>
          ,” vol.
          <volume>5</volume>
          , no.
          <issue>3</issue>
          , pp.
          <fpage>3794</fpage>
          -
          <lpage>3799</lpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Q.</given-names>
            <surname>Lu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kihl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Zhu</surname>
          </string-name>
          , and W. Zhang, “
          <article-title>CF4BDA: A Conceptual Framework for Big Data Analytics Applications in the Cloud,” IEEE Access</article-title>
          , vol.
          <volume>3</volume>
          , no.
          <source>March</source>
          <year>2017</year>
          , pp.
          <fpage>1944</fpage>
          -
          <lpage>1952</lpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>M.</given-names>
            <surname>Alam</surname>
          </string-name>
          and
          <string-name>
            <given-names>K.</given-names>
            <surname>Ara Shakil</surname>
          </string-name>
          , “
          <article-title>Big Data Analytics in Cloud environment using Hadoop Mansaf Alam</article-title>
          and Kashish Ara Shakil Department of Computer Science, Jamia Millia Islamia, New Delhi,” Dep.
          <source>Comput. Sci. Jamia</source>
          Millia Islam. New Delhi.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>N.</given-names>
            <surname>Rajyaguru</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Vinay</surname>
          </string-name>
          , “
          <article-title>A Comparative Study of Big Data on Mobile Cloud Computing,”</article-title>
          <source>Indian J. Sci. Technol</source>
          ., vol.
          <volume>10</volume>
          , no.
          <issue>21</issue>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>10</lpage>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <surname>Yang</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          (
          <year>2011</year>
          ).
          <source>The Application of MapReduce in the Cloud Computing. 2011 2nd International Symposium on Intelligence Information Processing and Trusted Computing</source>
          , Hubei,,
          <volume>154</volume>
          -
          <fpage>156</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>S.</given-names>
            <surname>Daneshyar</surname>
          </string-name>
          , “
          <article-title>Large-Scale Data Processing Using MapReduce in Cloud Computing Environment,”</article-title>
          <source>Int. J. Web Serv. Comput.</source>
          , vol.
          <volume>3</volume>
          , no.
          <issue>4</issue>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>13</lpage>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <surname>Voruganti</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          (
          <year>2014</year>
          ).
          <article-title>Map Reduce a Programming Model for Cloud Computing Based On Hadoop Ecosystem</article-title>
          .
          <source>(IJCSIT) International Journal of Computer Science and Information Technologies</source>
          , Vol.
          <volume>5</volume>
          (
          <issue>3</issue>
          ) ,
          <year>2014</year>
          ,
          <fpage>3794</fpage>
          -
          <lpage>3799</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Guangdeng</given-names>
            <surname>Liao</surname>
          </string-name>
          ,
          <string-name>
            <surname>K. D.</surname>
          </string-name>
          (
          <year>2013</year>
          ).
          <article-title>Gunther: Search-Based Auto-Tuning of MapReduce</article-title>
          .
          <source>Part of the Lecture Notes in Computer Science book series (LNCS</source>
          , volume
          <volume>8097</volume>
          ).
          <source>European Conference on Parallel Processing</source>
          ,
          <fpage>406</fpage>
          -
          <lpage>419</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>M.</given-names>
            <surname>Li</surname>
          </string-name>
          et al., “Mronline:
          <article-title>MapReduce online performance tuning</article-title>
          ,
          <source>” HPDC 2014 - Proc. 23rd Int. Symp. High-Performance Parallel Distrib. Comput.</source>
          , pp.
          <fpage>165</fpage>
          -
          <lpage>176</lpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>M.</given-names>
            <surname>Khan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. A.</given-names>
            <surname>Taylor</surname>
          </string-name>
          , and M. Khan, “
          <article-title>Optimizing hadoop parameter settings with gene expression programming guided PSO,” Concurr</article-title>
          . Comput. , vol.
          <volume>29</volume>
          , no.
          <issue>3</issue>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>21</lpage>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>A.</given-names>
            <surname>Britto</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Pozo</surname>
          </string-name>
          , “
          <article-title>I-MOPSO: A suitable PSO algorithm for many-objective optimization</article-title>
          ,
          <source>” Proc. - Brazilian Symp. Neural Networks, SBRN</source>
          , pp.
          <fpage>166</fpage>
          -
          <lpage>171</lpage>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>C. A. Coello</given-names>
            <surname>Coello</surname>
          </string-name>
          and
          <string-name>
            <given-names>M. S.</given-names>
            <surname>Lechuga</surname>
          </string-name>
          , “
          <article-title>MOPSO: A proposal for multiple objective particle swarm optimization</article-title>
          ,
          <source>” Proc. 2002 Congr. Evol. Comput. CEC</source>
          <year>2002</year>
          , vol.
          <volume>2</volume>
          , pp.
          <fpage>1051</fpage>
          -
          <lpage>1056</lpage>
          ,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <surname>Kennedy</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Eberhart</surname>
            <given-names>R</given-names>
          </string-name>
          .
          <article-title>Particle swarm optimization</article-title>
          .
          <source>In Proceedings., IEEE International Conference on Neural Networks</source>
          ,
          <year>1995</year>
          ; 4:
          <fpage>1942</fpage>
          -
          <lpage>1948</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>A. W.</given-names>
            <surname>McNabb</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. K.</given-names>
            <surname>Monson</surname>
          </string-name>
          , and
          <string-name>
            <given-names>K. D.</given-names>
            <surname>Seppi</surname>
          </string-name>
          , “
          <article-title>Parallel PSO using MapReduce,” 2007 IEEE Congr</article-title>
          .
          <source>Evol. Comput. CEC</source>
          <year>2007</year>
          , vol.
          <volume>15213</volume>
          , pp.
          <fpage>7</fpage>
          -
          <lpage>14</lpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>J.</given-names>
            <surname>Narayan</surname>
          </string-name>
          and
          <string-name>
            <given-names>S.</given-names>
            <surname>Shetty</surname>
          </string-name>
          , “
          <article-title>Handling Big Data Analytics Using Swarm Intelligence</article-title>
          ,” vol.
          <volume>2</volume>
          , no.
          <issue>6</issue>
          , pp.
          <fpage>271</fpage>
          -
          <lpage>275</lpage>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>W.</given-names>
            <surname>Hu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. G.</given-names>
            <surname>Yen</surname>
          </string-name>
          , and
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhang</surname>
          </string-name>
          , “
          <article-title>Multiobjective particle swarm optimization based on Pareto entropy,” Ruan Jian Xue Bao/Journal Softw</article-title>
          ., vol.
          <volume>25</volume>
          , no.
          <issue>5</issue>
          , pp.
          <fpage>1025</fpage>
          -
          <lpage>1050</lpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>J.</given-names>
            <surname>Leiva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. C.</given-names>
            <surname>Pardo</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Aguado</surname>
          </string-name>
          , “
          <article-title>Data analytics-based multiobjective particle swarm optimization for determination of congestion thresholds in LV networks,” Energies</article-title>
          , vol.
          <volume>12</volume>
          , no.
          <issue>7</issue>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>T.</given-names>
            <surname>Li</surname>
          </string-name>
          and
          <string-name>
            <given-names>B.</given-names>
            <surname>Yang</surname>
          </string-name>
          , “
          <article-title>A review of multi-objective particle swarm optimization algorithms in power system economic dispatch</article-title>
          ,
          <source>” Int. J. Simul. Syst. Sci. Technol</source>
          ., vol.
          <volume>17</volume>
          , no.
          <issue>27</issue>
          , pp.
          <fpage>15.1</fpage>
          -
          <lpage>15.5</lpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>G.</given-names>
            <surname>Lamont</surname>
          </string-name>
          ,
          <article-title>Evolutionary Algorithms for Solving Multi-Objective Problems</article-title>
          , no.
          <source>May</source>
          .
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>M.</given-names>
            <surname>Pospelova</surname>
          </string-name>
          , “
          <article-title>Real Time Autotuning for MapReduce on Hadoop</article-title>
          /YARN,”
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>J.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Xiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Li</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Y.</given-names>
            <surname>Pan</surname>
          </string-name>
          , “
          <article-title>M2M: A simple Matlab-toMapReduce translator for cloud computing,”</article-title>
          <source>Tsinghua Sci. Technol</source>
          ., vol.
          <volume>18</volume>
          , no.
          <issue>1</issue>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>9</lpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>S.</given-names>
            <surname>Mehrjoo</surname>
          </string-name>
          and
          <string-name>
            <given-names>S.</given-names>
            <surname>Dehghanian</surname>
          </string-name>
          , “
          <article-title>Mapreduce Based Particle Swarm Optimization for Large Scale Problems</article-title>
          ,”
          <source>AICS 2015 Proceeding 3rd Int. Conf. Artif. Intell. Comput. Sci</source>
          ., no.
          <source>October</source>
          , pp.
          <fpage>12</fpage>
          -
          <lpage>13</lpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>X.</given-names>
            <surname>Yong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Ying</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F.</given-names>
            <surname>Yanjun</surname>
          </string-name>
          , “
          <article-title>Research on cloud computing and its application in big data processing of railway passenger flow,” Chem</article-title>
          . Eng. Trans., vol.
          <volume>46</volume>
          , no.
          <year>2011</year>
          , pp.
          <fpage>325</fpage>
          -
          <lpage>330</lpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>S.</given-names>
            <surname>Lalwani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Sharma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Chandra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Kusum</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Jagdish</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Bansal</surname>
          </string-name>
          , “
          <article-title>REVIEW - COMPUTER ENGINEERING AND COMPUTER SCIENCE A Survey on Parallel Particle Swarm Optimization Algorithms</article-title>
          ,” Arab.
          <source>J. Sci. Eng</source>
          .,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>P. M.</given-names>
            <surname>Roth</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Winter</surname>
          </string-name>
          , “
          <article-title>Compiling MATLAB M-Files for Usage Within an MATLAB Compiler mcc</article-title>
          ,” pp.
          <fpage>1</fpage>
          -
          <lpage>21</lpage>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>E.</given-names>
            <surname>Zitzler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Deb</surname>
          </string-name>
          , and L. Thiele, “
          <article-title>Comparison of multiobjective evolutionary algorithms: empirical results</article-title>
          .,
          <source>” Evol. Comput.</source>
          , vol.
          <volume>8</volume>
          , no.
          <issue>2</issue>
          , pp.
          <fpage>173</fpage>
          -
          <lpage>195</lpage>
          ,
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>W. J.</given-names>
            <surname>Lim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. B.</given-names>
            <surname>Jambek</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S. C.</given-names>
            <surname>Neoh</surname>
          </string-name>
          , “
          <article-title>Kursawe and ZDT functions optimization using hybrid micro genetic algorithm (HMGA),” Soft Comput</article-title>
          ., vol.
          <volume>19</volume>
          , no.
          <issue>12</issue>
          , pp.
          <fpage>3571</fpage>
          -
          <lpage>3580</lpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>S.</given-names>
            <surname>Lalwani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Singhal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Kumar</surname>
          </string-name>
          , and
          <string-name>
            <given-names>N.</given-names>
            <surname>Gupta</surname>
          </string-name>
          , “
          <article-title>a Comprehensive Survey: Applications of Multi-Objective Particle Swarm Optimization (Mopso) Algorithm,”</article-title>
          <source>Trans. Comb. ISSN</source>
          , vol.
          <volume>2</volume>
          , no.
          <issue>1</issue>
          , pp.
          <fpage>2251</fpage>
          -
          <lpage>8657</lpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>S.</given-names>
            <surname>Das</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. S.</given-names>
            <surname>Mullick</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P. N.</given-names>
            <surname>Suganthan</surname>
          </string-name>
          , “
          <article-title>Recent advances in differential evolution-An updated survey</article-title>
          ,
          <source>” Swarm Evol. Comput.</source>
          , vol.
          <volume>27</volume>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>30</lpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [33]
          <string-name>
            <given-names>K. A.</given-names>
            <surname>Venkatesh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Neelamegam</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>Revathy</surname>
          </string-name>
          , “
          <article-title>Using MapReduce and load balancing on the cloud: Hadoop MapReduce and virtualization improves node performance Cloud architecture</article-title>
          ,” no.
          <source>July</source>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>10</lpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          [34]
          <string-name>
            <given-names>L.</given-names>
            <surname>Bao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Liu</surname>
          </string-name>
          , and W. Chen, “
          <article-title>Learning-based Automatic Parameter Tuning for Big Data Analytics Frameworks</article-title>
          ,”
          <source>Proc. 2018 IEEE Int. Conf. Big Data (Big Data 2018)</source>
          , pp.
          <fpage>181</fpage>
          -
          <lpage>190</lpage>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          [35]
          <string-name>
            <given-names>M.</given-names>
            <surname>Khan</surname>
          </string-name>
          , “
          <article-title>Hadoop Performance Modeling and Job Optimization for Big Data Analytics</article-title>
          ,”
          <source>Ph.D. dissertation, Brunel Univ. London</source>
          , p.
          <fpage>157</fpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          [36]
          <string-name>
            <given-names>S.</given-names>
            <surname>Cheng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Zhang</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Q.</given-names>
            <surname>Qin</surname>
          </string-name>
          , “
          <article-title>Big data analytics with swarm intelligence</article-title>
          ,”
          <source>Ind. Manag. Data Syst.</source>
          , vol.
          <volume>116</volume>
          , no.
          <issue>4</issue>
          , pp.
          <fpage>646</fpage>
          -
          <lpage>666</lpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref37">
        <mixed-citation>
          [37]
          <source>MATLAB</source>
          , version 7.10.0 (
          <issue>R2020a</issue>
          ), The MathWorks Inc., Natick, Massachusetts,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref38">
        <mixed-citation>[38] MathWorks, “Parallel Computing Toolbox.” [Online]. Available: https://www.mathworks.com/products/parallel-computing.html</mixed-citation>
      </ref>
      <ref id="ref39">
        <mixed-citation>
          [39]
          <string-name>
            <given-names>P. V.</given-names>
            <surname>Raja</surname>
          </string-name>
          and
          <string-name>
            <given-names>E.</given-names>
            <surname>Sivasankar</surname>
          </string-name>
          , “
          <article-title>Modern framework for distributed healthcare data analytics based on hadoop</article-title>
          ,”
          <source>Lect. Notes Comput. Sci. (including Subser. Lect. Notes Artif. Intell. Lect. Notes Bioinformatics)</source>
          , vol.
          <volume>8407</volume>
          LNCS, pp.
          <fpage>348</fpage>
          -
          <lpage>355</lpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref40">
        <mixed-citation>
          [40]
          <string-name>
            <given-names>V.</given-names>
            <surname>Medel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Rana</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Banares</surname>
          </string-name>
          , and
          <string-name>
            <given-names>U.</given-names>
            <surname>Arronategui</surname>
          </string-name>
          , “
          <article-title>Modelling performance and resource management in Kubernetes</article-title>
          ,”
          <source>Proceedings - 9th IEEE/ACM International Conference on Utility and Cloud Computing (UCC 2016)</source>
          , pp.
          <fpage>257</fpage>
          -
          <lpage>262</lpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>