This essay has been submitted by a student. This is not an example of the work written by our professional essay writers.
The distributed system consists of processing elements which are provided with task. But the task scheduling is a tedious process for which various algorithms have been proposed and in this paper a survey is done on the four of such algorithms such as genetic algorithm, particle swarm optimization, fuzzy model and dynamic critical path duplication. Their pros and cons are discussed with a comparative study.
In distributed system task scheduling is very much important.it can be either local or global scheduling. What a distributed system mean? It is a group of loosely arranged processing units. If local scheduling is done then task is scheduled on the basis of time slices of each processing units. If we consider a task to be scheduled or not then it is global scheduling.
We have majorly two schemes of task scheduling. One is static scheduling which uses graph theoretic or queuing theoretic approach to obtain the optimal solution. But the case is the behaviour of the task must be known earlier. The second one is dynamic scheduling in which the constrain of knowing the behaviour of task is taken in addition to that state of the machine before scheduling is also considered. Adaptive and non-adaptive are the two classifications of dynamic schemes.
The task sharing and task balancing are two concepts which are centred on the task scheduling .This paper provide a comparison for various dynamic task scheduling algorithms. It gives the clear-cut difference and similarities of various algorithm schemes like genetic algorithm, particle swarm optimization, fuzzy logic and dynamic critical path duplication.
Genetic algorithms (GAs) are searching techniques which use meta heuristics. These provide random search which scans through the sample space and provide optimal solutions.
Particle swarm optimization (PSO) is a population based stochastic optimization technique, which shows general behaviour of bird flocking or fish schooling. Fuzzy logic is a logic based on multiple values which makes intermediate results in between evaluations in the form of two values yes or no, high or low etc. Dynamic critical path duplication
II. ALGORITHM ââ‚¬"COMPARATIVE STUDY
A) Genetic Algorithm
Task allocation problem is a problem faced by the scheduler in a distributed system.to overcome that genetic algorithm was proposed which minimises the execution time of the scheduler. Population of the possible solution is taken in to account.
The genetic algorithm is a meta heuristic technique which looks for polynomial time solution space.it uses previous search information with modulations that are generated randomly.
Mainly three steps are involved in this technique
i) Selection based on the efficiency
ii) Crossover promotes exploration
iii) Random mutation promotes exploration
The algorithm was developed based on a schedule developed by zomaya et al known as the stage of the art homogeneous GA scheduler.
The working is as the unscheduled tasks are first allotted to the queue of each processor. when the scheduler is invoked batches of task from this queue are generated and scheduled each idle processor takes up this batch of task and processes them. The processor due to limitation of network resource does not contain a queue of its owns. If large number of tasks is present, it will increase the speed of the scheduler and to reduce the idle period of the processor. Larger batch can be used so we can use settling of batch size dynamically according to time till the first processor becomes idle. The resources availability fluctuations can be minimised by using smoothing function.
Encoding: it is the process of batching in which each character unique id number of a task is a mapping between a task and a processor.
No: of character =no of task +no: of process-1
Fitness function: this function attaches a value of goodness of schedule to each individual in the population.
The fitness value generated by relative error load is determined by the finishing time which is used to calculate the processing time which is optimal. The schedule which is better has a larger value.
Selection, crossover & mutation: The standard weighted roulette wheel method is used for the selection. A slot is allocated for each individual in between 1 and 0.then cyclic crossover is promoted. finally mutation which uses two methods where former swaps the elements of individuals and later rebalances heuristic to mutate. these technique results in population improvement.
Stopping conditions: The minimal total execution time is considered and if it is less than specified value then algorithm stops working. Another alternative is one of the processor becomes idle.
The complexity of the genetic algorithm is O (N2) where N is the no of task which is fed into the algorithm.
B) Particle swarm optimisation
In this algorithm a swarm or group of individuals (particles) searches in the solution space where each particle is a solution for the optimisation problem. the space in the search space is determined by each particle will be the best position. The global best particle is obtained when the neighbourhood is entire swarm and algorithm is gbest PSO.
If the neighbourhood is small then algorithm is the lbest PSO.the fitness function and its local best particle gives its performance. The parameters like particle current position, particle velocity and particle best position are the characteristic of each particle.
For communication of the particles the topologies like the ring, star, master slave can be used where star is the toppest for achieving balance as inertia is considered.
The random generation of particles take place initially and fitness is calculated. Then best and gbest values are determined which results in updation of velocity and position. This is repeated for several itrerations.the global optimal solution is obtained. The comparison takes place between the new task arrived and the task in the queue which changes the sequence and the particle gives the task scheduling decision.
Stopping condition: The algorithm stops when maximum number of iterations is reached.
C) Dynamic Critical Path Duplication Task Scheduling Algorithm
This scheduling algorithm uses schedule holes for advancing the earliest starting time and processing element utilization. The dominant task is found out .then a function is generated from condition such as first coming task scheduled first, larger computation cost task scheduled first, the task having larger reminder scheduled first. This function is known as priority function.
First variables are initialised. Then execution rate and communication rates on average basis is calculated.
Assumption of this algorithm is that one processor assigned only one task and the communication volumes between tasks determine the communication overhead.
Then DCPD algorithm uses average execution rates for estimating tasks blevel in bottom up manner. The task without predecessor is considered to be in ready set. Till the ready set is empty the following steps occur. Using priority function the priority of unscheduled task is determined. With the use of dominant task the maximum priority within ready set is selected. Then with the help of inserting policy it is given to a processing element. The dominant predecessors are obtained with its data ready time. If free time slot is there then schedule the duplicate dominant predecessor to that slot and advance with the inserting policy. Allocate to that processor and make that processor examined. Recheck the ready set. Repeat these steps.
The time complexity of DCPD is O ((|N|+|E|) |N||P|) where N is the set of task node, E is the set of communication edges and P is the set of hetrogeneous processing elements
D) Fuzzy Model
It utilizes the data received from the current state of processor where randomly generated tasks are provided on first-come-first-serve rule. In the model, the task is ready for assigning to the processor. Then local schedule uses bidding which makes the decision. The main feature of fuzzy model is that more precise conclusions can be generated by using vague and imprecise data.
The basic assumptions on this model are the processing elements must be homogenous in nature apart from this. The task must be non-primitive, independent as well as randomly generated and should have random arrival time and exclusive dead time. It involves 3 steps:
Fuzzification of input variable (antecedents)
Application of fuzzy inference rules
Defuzzification of the result
Where, fuzzification includes transformation of antecedents into fuzzy variables or linguistics. The antecedents are processor executions length (PEk.Pel), processor delay (PEkdel) and task length.
The second is providing the fuzzy inference rules to generate a consequent. The rule is in the form of IF antecedent 1 AND antecedent 2 then consequent. The consequent is the output which consists of 4 fuzzy variables Acceptance high, Acceptance Low (AL), Rejection Low and Rejection High (RH) based on numeric values from 0 to 1. The mapping of the antecedents to a graph is obtained next.
The final step is defuzzification which uses a method called centroid method. It gives a centroid value and these are compared and best value is declared as the first. This provides a good decision for the scheduling with large number of processor or task.
III. PRONS AND CONS
The algorithms discussed above have various advantages and disadvantages. By comparing the particle swarm optimization with genetic algorithm on the basis of task arriving at fixed time and having an execution time. The particle swarm optimization is cost effective, works more precisely, with better load balancing and exploration of search space than genetic algorithm but genetic algorithm has less iteration than PSO.
The fuzzy logic model works in a homogeneous distributed system uses an exponential distribution provides better task scheduling decision and has a better working performance with larger no of processor and task.
Considering the DCPD algorithm which out performs the existing list-based & clustering algorithm for scheduling have the advantage of improved performance in terms of scheduling length and utilization.it makes use of schedule holes and earliest starting time of some task can be advanced. Apart from this it has a dis advantage of high complexity because of backward search for duplicating the predecessor of selected task.
IV. COMPARITIVE STUDY OF ALGORITHMS
The comparison of the algorithms based on the processing element type, complexity, input and scheduling holes exploitation is done and is given below
EXPLOITING SCHEDULE HOLES
Fig 1.Comparative study of algorithms
In this paper a survey is done on the four of algorithms such as genetic algorithm, particle swarm optimization, fuzzy model and dynamic critical path duplication. Their pros and cons are discussed with a comparative study. From the study it can be concluded that each of the algorithms has its own prons and cons and canââ‚¬â„¢t be provided as one is superior to other. For the homogeneous system fuzzy logic is better and for the heterogeneous system particle swarm optimization and its variants provide the best solution.