Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0) Job Migration for Load Balancing Algorithm in Grid Computing Using Queue Length parameter Ali Wided Bouakkaz Fatima Department of Mathematics and Computer Science Department of Mathematics and Computer Science University of tebessa University of tebessa Tebessa, Algeria Tebessa, Algeria wided.ali@univ-tebessa.dz f.bouakkaz@univ-tebessa.dz aliwided1984@gmail.com f_bouakkez@esi.dz Abstract— based on the previous works, we used the job is enhancing the performance of application by minimizing migration technique for hierarchical load balancing. In this slowdown, and the waiting time in the global queue, paper the authors propose a novel job migration algorithm for maximizing the resources usage rate and load balancing dynamic load balancing (JMADLB), in which parameters(such among the resources. as CPU queue length) have been considered which is used for the selection of overloaded resources (or underloaded ones) in grid. . II. RELATED WORKS In dynamic load balancing state, the system will change The authors[5] have suggested a Priority based Dynamic dynamically until it reaches a balance. In a grid environment, efficiency of resources varies with time. Thus, the allocation of Load Balancing Algorithm (PBDLB). When a node is jobs must be adjusted dynamically in according with the overloaded, it calls the MSN which then finds a suitable node variation of the resources status. The proposed algorithm has and then performs the load balancing, a function msn ( ) finds been verified through Alea2 simulator and the simulation results the available under-loaded nodes by looking into a queue validate that the proposed algorithm allow us to reach our where all the processors are scheduled in the decreasing order objectives. of their computing power. here CPU queue length is considered. The Job Migration strategy is used through which Keywords— grid computing, load Balancing, job Migration, the migration of jobs takes place from the heavily node to the workload, information policy, location policy, selection policy, lightly-loaded node. The advantage of this algorithm is that it resources Allocation.component; takes into account the resource processing capability, where the nodes with high computing power have high priority; also I. INTRODUCTION it decreases the communication overhead and proves to be cost A computational Grid is a hardware and software real. The drawback of this study is not considering the fault infrastructure that gives dependable, consistent, pervasive, and tolerance. cheap access to high-end computational capabilities [1]. The In the study of [6] an Augmented Hierarchical Load challenges in grid computing lie in load balancing. Load Balancing with Intelligence Algorithm is proposed (AHLBI). balancing is an important issue for the problem of utilization. When a job request comes, the scheduler initializes job and It is those techniques which are designed to equally distribute cluster parameters comes, the scheduler initializes job the load on resources and maximize their utilization. These parameters and calculates the Expected computing power, techniques can be approximately categorized as centralized or ECP for each job together with ALC, Average System Load decentralized, dynamic or static, periodic or non-periodic [2]. and ACP of clusters before job allocation. The algorithm find Main purpose of load balancing is to enhance the response the deviation of ALC with the Average system load and find time of the application by which workload would be saved out the probability value of deviation for every cluster. If the according to resources. There are causes which are the major probability of deviation is within the range of 0 and 1, the raisons of load balancing, resubmission of jobs and job cluster is marked as under loaded. The ACP of under loaded migration; heterogeneity of resources, dynamic nature of clusters is compared with the ECP of jobs. If the ACP value resource’s performance and diversity of applications in case of of a cluster is less than or equal to ECP of jobs, the cluster is Grids[3]. This is even more vital in computational Grid where considered as fittest and job is allocated to it. After job the main concern is to equally allocate jobs to resources and to allocation to clusters, some clusters may remain underutilized. minimize the difference between the overloaded and the To avoid this, AHLBI compares the queue length of all the underloaded resource load [4]. Efficient load balancing clusters. Jobs from clusters with large queue size are stolen through the Grid is required for improving performance of the and allocated to free clusters. Similarly, when the number of system. The overloaded grid resources can be balanced by jobs waiting to be executed in a cluster’s queue increases, jobs migrating jobs to the idle processors, i.e. a set of processors to from queue tail is allocated to free clusters for execution. the which a processor is directly connected [4]. Our contributions advantages of this algorithm are reducing idle time of clusters are: First, we proposed a hierarchical load balancing and makespan. The drawback of this strategy is that it does not algorithm. Second, we verified the proposed algorithm through Alea 2 simulator. The objective of proposed algorithm take into account the resource processing capacity and the gathering information algorithm fault tolerance. Begin In this paper [7], the authors proposed the Distributed T 5 seconds Load Balancing Model for Grid Computing that represents a Waiting for jobs; Grid topology based on a forest structure. Jobs migration is Create jobs queue for each node; presented on two levels, namely (a) intra-cluster and (b) inter- For every Node N and in each one second of T intervals do cluster load balancing. The nodes of the cluster send their load Calculate (Qlength); information to cluster managers. Cluster manager is End For responsible of saving the nodes load information and also Load (Qlength)=(Q0+Q1+…..QT)/T; distribution of the information with other cluster managers According to its period cluster manager receives Load through inter-cluster communication. The advantage of this informations from all nodes and compute load of cluster C algorithm is that it takes into consideration the heterogeneity associated. of the resources, it reduces the response time and the Cluster manager Sends Load information of C to Grid communication cost. The drawback of this strategy is that it manager does not take into account the resource processing capabilities Loop and the fault tolerance. wait for load change // happening of any of defined events III. PROPOSED LOAD BALANCING ALGORITHM if (events_happens ()=1 or events_happens ()=4) then begin The proposed Load Balancing implements three policies: Remove terminated or migrated job from the waiting queue Information Policy, selection Policy and location Policy. For Subtract their load value from the total local load of node. implementation of Information Policy we use activity based approach. We use FIFO strategy For implementing the Send new load to its cluster manager associated ; Selection Policy, for implementation of location policy We End use as Load Index Queue Length. On the basis of Load Index if (events_happens ()=2 or events_happens ()=3) then Load Balancer decides to activate Load Balancing process. begin Add the newly created or incoming job for the waiting queue Notations used in our algorithms are summarized and Add their load value for the total local load of node shown in Table I: Send new load to its cluster manager associated; TABLE I. NOTATIONS USED End Parameter Description If ((events_happens () =6) and (events_happens () =7)) then N Node begin LoadR Load of resource ask the slowest resource to send a portion of its load to Qlength Queue length of resource the idle resource . THH The higher threshold End THL The lower threshold OLD-list Overloaded List End Loop ULD-list Underloaded List BLD-list Balanced List Function events_happens () Loadavg Average Load output Type: integer NBRN Number of Nodes of cluster c C cluster begin If (Job.state=Termination) then events_happens () =1; If (Job.state=Start) then events_happens () =2; A. Intra-cluster load balancing algorithm If (Job.state=Incoming Migrating ) then events_happens ()=3; Depending on its current load, each cluster manager If (Job.state = migrated) then events_happens ()=4; decides to start a Job Migration operation. In this case, the If (Arrival of any new resource) then events_happens ()=5; cluster manager tries, in priority, to balance its Load among its If(resouce.state= idle) then events_happens () =6; nodes. To implement this local load balancing, we propose the if(resource.state= slowest) then events_happens ()=7; following algorithms: if(cluster.state=saturated then events_happens ()=8; Algorithm 1: Information policy if (cluster.state=unbalanced) then events_happens ()=9; end We considered the load of node at given time was described simply by CPU queue length. it denotes the number Algorithm 2: Location policy of processes which are waiting to be executed. This algorithm classifies the nodes according to their load. We calculated this parameter as follow: it used three states for classifying: overloaded, underloaded and balanced. In the first time, we must calculate two Load (Qlength) = (Q1+Q2+……...+QT)/T threshold values for Qlength parameter. Where:Q1,Q2,……...,QT is the value of Qlength in a previous one second interval. The calculation of these thresholds is done as follow: T is the number of time intervals. Calculate load average of Qlength parameter over all End For nodes Sort OLD_list by descending order relative to their Load N. Loadavg(Qlength)=(load1+load2+….loadn)/nbr; Where Sort ULD_list by ascending order relative to Their Load N. End. Loadavg(Qlength) is the average load of Qlength over all nodes. Algorithm 3: Job Migration Decision load1,load2,….loadn are the current load of Qlength of each After classifying the nodes, in the next step cluster manager node calculated by Load estimation algorithm. nbr is the decide to migrate jobs from overloaded to under-loaded nodes, number of nodes. it applies the following algorithm: Calculate the threshold values Job Migration Decision algorithm The higher and lower threshold values of Qlength begin parameter are calculated by multiplying the average load of While (OLD-list ≠ .AND. ULD-list ≠ ) do Qlength and a constant value. For i = 1 To ULD-list.Size() do THH=H*Loadavg Select job from queue of first node belonging to OLD-List by FCFS algorithm THL=L* Loadavg Migrate the selected job from first Where THH is the high threshold and THL is the low Sender node of OLD-List to ith receiver threshold node of ULD-list; Update the current LoadN of receiver H and L are constants. and sender nodes; The next step is to partition the nodes for balanced, Update OLD-list, ULD-list and BLD-list; overloaded and underloaded nodes by using the threshold Sort OLD-list by descending order of their values as follow: LoadN;  Overloaded : the node will be added for overloaded End For list if Qlength is high, that is mean if the number of End jobs in the queue of node is high then the node is classified as overloaded node. B. Inter-cluster load balancing algorithm  Underloaded: the resource will be added for This algorithm applies a global load balancing among all underloaded list if Qlength is low. clusters of the Grid. The Inter-cluster load balancing at this level is made if a cluster manager fails to balance its Load  balanced :the node are not into overloaded list and among its associated nodes. In this case the grid manager underloaded list are the node in the balanced load state migrates Jobs from overloaded clusters to under loaded they are considered as more loaded than the low state clusters. We propose the following algorithms: and less loaded than the high state. Inter-cluster load balancing algorithm Location policy algorithm Begin begin If(events_happens ()=8) then Inter-cluster load balancing According to period T do algorithm Grid manager receives Load informations of clusters from its somme 0; Cluster Managers . For every Node N of cluster C do Grid manager collect related informations of its clusters in the Somme Somme+ LoadN(Qlength) ; clusters information table; End For Grid manager partitions grid into overloaded (OLD), under- Loadavg(Qlength)= somme1/NBR-N; loaded (ULD) and balanced (BLD) clusters; THH(Qlength)= Loadavg(Qlength)*H; Create OLD_clusters_table; THL(Qlength)= Loadavg(Qlength)*L; Create ULD_clusters_table; Partionning Nodes into overloaded list OLD-list , underloaded Sort clusters Cj of OLD_clusters_table by Descending order list ULD-list and balanced list BLD-list of their Load; OLD-list ; ULD-list← ; BLD-list← ; For Every cluster Cj of OLD_clusters_table Do For every Node N of cluster C do begin If (LoadN(Qlength) )>THH(Qlength) then Sort clusters Cr of ULD_clusters_table by Ascending OLD-list ←OLD-list N; order of their Load Else If (LoadR(Qlength) )< THL(Qlength)) Sort nodes of Cj by descending order of their Load then ULD-list ULD-list N While (OLD_clusters_table ≠ Φ AND ULD_clusters_table ≠ Else BLD-list BLD-list N; Φ) Do Begin End If Sort the clusters Cr of ULD_clusters_table by ascending order of inter clusters (Ci-Cr) WAN 3. Average resource usage rate bandwidth sizes. C. Performance Evaluation Sort the nodes of Ci by descending order of their The proposed algorithm provides better Job Migration load mechanism with DLBA. The important performance factors in Sort Jobs of first node of Ci by FCFS algorithm and estimating our proposed algorithm are decreasing response communication cost time, reducing slowdown and maximizing resource utilization. Migrate the selected job from the first node of Ci to We performed some experiments for evaluating efficiency and jth cluster of ULD_clusters_table performance of the proposed algorithm. Update the current Load of receiver cluster Update ULD_clusters_table, and OLD_clusters_table Experimentations 1: End In the first experimentation, we have focused on the Endfor average response time (in sec), according to various numbers End of jobs and clusters. We have supposed different numbers of clusters and we considered that each cluster is composed of IV. EXPERIMENTAL RESULTS various numbers of resources. The results showed that We implemented the proposed load balancing algorithm in proposed algorithm surpassed other algorithms by decreasing Java using JDK 1.8. The implementation is done on the average response time. In FCFS (first come, first served) a Alea 2 grid simulator[8]. algorithm, if the resource demanded by the first job in the queue is not available, the remaining jobs cannot schedule A. Simulated parameters even if the demanded resources are available. In the proposed In order to evaluate the feasibility and the performance of algorithm, it works as FCFS in the job selection but when the our algorithms we have tested them in Alea 2 Grid simulator. first job in the queue cannot be scheduled directly the We utilized the following parameters: proposed algorithm estimate the earliest probable starting time for the first job using the processing time calculated of 1. Resource parameters: these parameters give running jobs. Then, it makes a reservation to run the job at this information about available resources during load pre-estimated time. Next, it examines the queue of waiting balancing period such as: jobs and directly schedules every job not intervening with the a. number of Clusters reservation of the first job. b. number of resources in each Cluster From figure 1 and 2, we can perceive that the proposed algorithm works much better than FCFS and EDF(Earliest c. size of memory (RAM) deadline first ) since jobs are equally distribute over available d. date to send load information from resources clusters. The average response time and average slowdown are slightly better for the proposed algorithm. we have tried e. tolerance factor. solving the problem of staturation by preventing the 2. job parameters: these parameters include: overloaded resources and we don’t permit receiving more jobs. Evidently, simple solution is not enough for more a. number of jobs queued at every resources; complex problems. We will try to fully comprehend this phenomenon in the future since it is outside the focus of this b. arrival time, waiting time, submission time paper. ,start time , processing time and finish time c. job length d. job priority 3. Network parameter: LAN and various WAN bandwidth sizes. 4. Load index: we have used CPU queue Length Where CPU queue Length denotes number of waiting jobs in queue of resource. B. Performance parameters We have focused on the following parameters: 1. Average response time: the response time is the time a job spends in the system which means the time from its arrival to its termination. 2. Slowdown: denotes as the ratio of response time to processing time . Slowdown= Max (response time/processing time) of all jobs. Fig 1. Comparison of avg response time(in sec), between FCFS, EDF and JMADLB with 20 clusters Fig 2.Comparison of avg slowdown(in sec) between FCFS, EDF and JMADLB with 20 clusters Fig 3. Comparison of Cluster Utilization (%) between FCFS, EDF and JMADLB with 14 clusters Experimentations 2: In the second experimentation, we have focused on the resource utilization (%). We have supposed number of clusters is 14, and we considered that each cluster is composed of Experimentations 3: various numbers of resources. Number of jobs is 3000. Figure In the third experimentation, we have focused on waiting 3 shows the cluster utilization of different algorithms. job in queue and we have compared it with running job. We Examination of the cluster utilization showed that FCFS have supposed the number of clusters is 20 and we considered and EDF did not perform well. It shows that the cluster-11 that each cluster is composed of various numbers of resources. which is having the highest processing resources is over Number of jobs is 3000. utilized, while clusters 2, 4, 5, 8 and 9 are idle in FCFS and EDF scheduling algorithms. The reason behind improvement is even utilization of all clusters which is achieved because JMADLB balances the load between clusters at the time of scheduling. It shows that the JMADLB is better performed compared to traditional scheduling algorithms, also because load balancing is used to make sure that none of existing clusters are idle while others are being utilized. The load balancing effects are caused by under-loaded clusters. In the proposed algorithm there is an increase of utilization of cluster 2 from 0% (before JMADLB algorithm) to 8% (after) and cluster 4 from 0% to 2.5% and cluster 8 from 0% to 4% and cluster 9 from 0% to 30%. In this case, the different utilizations of the participating clusters are balanced. On the other hand, jobs with specific requirements have to wait until the suitable resources become available. This in fact generates higher system utilization on particular clusters. We have tried Fig 4 . Comparison of running jobs and waiting jobs of JMADLB algorithm with 20 clusters solving that by migrate the jobs for idle resources for load balancing and preventing the overloaded clusters. Moreover the JMADLB algorithm permits the scattering of the job on the most available resources when there was no appropriate resource, unlike the other scheduling algorithms that try to select the best resource resembling to the job requirements; otherwise, the job will stay in the global queue, which indicates an under-utilization of the resources. V. CONCLUSION AND FUTURE WORK In this paper we have presented a load balancing algorithm in grid computing environment. To validate the proposed algorithm, we have used a Grid simulator in order to measure its performance. The first experimentation results are very promising and lead to a better load balancing between nodes of a Grid without high computing overhead. We have obtained good results especially for resource utilization. In the future, the authors want to develop the proposed algorithm by adding the multi-agent systems and we will run our algorithm in decentralized manner. Nevertheless, our algorithm has some limitation that the authors intend to address in the future. In this work, the authors did not study the effect of increasing the number of Job Migration and the performance degradation due to the migration in addition to the drawback of the centralized Fig 5 . Comparison of running jobs and waiting jobs of EDF algorithm with 20 clusters system. So for the future work, the authors will be interested by these directions. References [1] Foster, C. Kesselman, J.M. Nick, and S.Tuecke.‚Grid services for distributed system integration'. IEEE Computer, 35(6):pp.37-46,2002. [2] Y. Zomaya, Y.Teh, Observations on using genetic algorithms for dynamic load-balancing, IEEE Transactions on Parallel and Distributed Systems, vol. 12, No.9, 2001, pp.899-911 . [3] Belabbas Yagoubi , Hadj Tayeb Lilia and Halima Si Moussa , 2006. Load Balancing in Grid Computing. Asian Journal of Information Technology, 5: 1095-1103. [4] B.Yagoubi ,Y. Slimani,Job Load Balancing Strategy for Grid Computing .Journal of Computer Science 3(3),pp.186-194,2007. [5] S. Kumar, and N. Singhal. (Jul, 2012). A Priority based Dynamic Load Balancing Approach in a Grid based Distributed Computing Network, International Journal of Computer Applications, Vol. 49, No.5, pp. 819- 828. Fig 6. Comparison of running jobs and waiting jobs of FCFS algorithm with [6] Joshua Samuel Raj, Hridya K. S and V. Vasudevan, Augmenting 20 clusters Hierarchical Load Balancing with Intelligence in Grid Environment , International Journal of Grid and Distributed Computing Vol. 5, No. 2, June, 2012. [7] B. Yagoubi, and M. Meddeber. (2010). Distributed Load Balancing The horizontal axis represents time (units of days) while Model for Grid Computing, Revue ARIMA, Vol. 12, pp. 43-60. the vertical axis indicates that the number of jobs. The red [8] Dalibor Klusáček and Hana Rudová.(2010). Alea 2 job scheduling simulator. In Proceedings of the 3rd International Conference on curve shows waiting job who says that a job in the waiting Simulation Tools and Techniques (SIMUTools 2010), Torremolinos, jobs queue, the green curve shows running job which say that Malaga, Spain. a job is running or executing. EDF and FCFS are not able to Website: Dalibor Klusáček and Hana Rudová .Retrieved from schedule jobs easily, generating greatest waiting jobs during http://www.fi.muni.cz/~xklusac/workload the time. For 3000 jobs and with 20 clusters JMADLB, is capable of a higher resource utilization and there is no waiting jobs in the queue through the time as can be seen in figure 4.