Prioritized Workflow Scheduling In Cloud Computing Computer Science Essay

Cloud computing offers extensive computation and resource facilities for the execution of various application workflows, and many different resources are involved in executing a single workflow. It is a highly dynamic environment in which the system load and the status of resources change frequently. As the workload grows with the increase in cloud services and clients, these requests or jobs must be handled: they first need to be scheduled for execution on the different available VMs. The execution of cloud workflows faces many uncertain factors when allocating and scheduling workload. The first step is to provide an efficient workflow allocation model that takes the client's requirements into account. The workflow scheduling model should then schedule jobs so that all of them are executed in the minimal possible time while maintaining QoS and satisfying the client's requirements.

Introduction

Cloud computing is a large-scale distributed computing paradigm that is driven by economies of scale, in which a pool of abstracted, virtualized, dynamically scalable, managed computing power, storage, platforms, and services is delivered on demand to external customers over the Internet [17]. Cloud computing can also be described as a new way of computing, or a new way of using hardware and software resources: a user sitting at his or her own computer can, through the Internet and an application (commonly a browser), access a number of services provided by various cloud providers.


According to one of the many definitions of a workflow, a workflow is the automation of a business process, in whole or in part, during which documents, information or tasks are passed from one participant to another for action, according to a set of procedural rules [9].

A workflow models a process as consisting of a series of steps that simplify the complexity of execution and management of applications [13].

Scheduling is essentially the mapping of a set of tasks onto a set of processors, and workflow scheduling can be defined as the automation of workload scheduling. Scheduling falls into two categories: job scheduling, and job mapping and scheduling. In job scheduling, independent jobs are scheduled among the processors of a distributed system for optimization. Job mapping and scheduling requires the allocation of multiple interacting tasks of a single parallel program in order to minimize the completion time on a parallel computer system [12].

A task is a (sequential) activity that uses a set of inputs to produce a set of outputs. In the static case, processes in a fixed set are assigned to processors in advance, either at compile time or at start-up. There are thus two types of scheduling: static and dynamic. In static load balancing, all information is known in advance, tasks are allocated according to this prior knowledge, and the allocation is not affected by the state of the system. A dynamic load-balancing mechanism has to allocate tasks to processors dynamically as they arrive, and tasks must be redistributed when some processors become overloaded [10].
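
As a toy illustration (not taken from the cited work), the difference between the two policies can be sketched in Java as follows, using simplified Task and Processor classes assumed only for this example:

    import java.util.List;

    // Toy illustration of static vs. dynamic task allocation.
    // Task and Processor are simplified placeholder classes for this sketch.
    class Task {
        final double length;                 // estimated amount of work
        Task(double length) { this.length = length; }
    }

    class Processor {
        double load = 0;                     // total work assigned so far
        void assign(Task t) { load += t.length; }
    }

    public class LoadBalancingSketch {

        // Static: tasks are distributed round-robin using only prior knowledge,
        // regardless of the processors' state at run time.
        static void staticAssign(List<Task> tasks, List<Processor> procs) {
            for (int i = 0; i < tasks.size(); i++) {
                procs.get(i % procs.size()).assign(tasks.get(i));
            }
        }

        // Dynamic: each arriving task goes to the currently least-loaded
        // processor, so the allocation reacts to the state of the system.
        static void dynamicAssign(Task t, List<Processor> procs) {
            Processor best = procs.get(0);
            for (Processor p : procs) {
                if (p.load < best.load) best = p;
            }
            best.assign(t);
        }
    }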

RELATED WORK

There are various algorithms and models defined for workflow scheduling.

POSEC and Pareto Analysis

In the cloud, for job scheduling and for job mapping and scheduling, two algorithms address the optimization of these jobs: one based on the POSEC method [16] and the other based on Pareto analysis. POSEC stands for Prioritize by Organizing, Streamlining, Economizing and Contributing. The POSEC method prioritizes jobs on the basis of these parameters and then categorizes them into four levels, using urgency and importance as the deciding factors; scheduling is then applied to these levels for optimal execution. The objectives of this algorithm are efficient time management and load balancing. Two priority scores are needed to take the decision: an urgency score, given by the cluster member of the cloud, and an importance score, given by the cloud resources manager [12]. This yields four quadrants of decision making (a classification sketch follows the list below):

Level 1: Low Urgency & Low Importance

Level 2: Low Urgency & High Importance

Level 3: High Urgency & Low Importance

Level 4: High Urgency & High Importance
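
As an illustration only, the quadrant decision can be written as a small Java routine; the 0.5 cut-off on normalized scores is an assumption made for this sketch, not a value fixed by the cited algorithm:

    // Hypothetical quadrant classification for the POSEC-style scheduler.
    public class PosecQuadrant {

        static final double THRESHOLD = 0.5;   // assumed cut-off on normalized scores

        // urgency: given by the cluster member; importance: given by the
        // cloud resources manager (both assumed normalized to [0, 1]).
        static int level(double urgency, double importance) {
            boolean highUrgency = urgency >= THRESHOLD;
            boolean highImportance = importance >= THRESHOLD;
            if (!highUrgency && !highImportance) return 1; // low urgency, low importance
            if (!highUrgency)                    return 2; // low urgency, high importance
            if (!highImportance)                 return 3; // high urgency, low importance
            return 4;                                      // high urgency, high importance
        }
    }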

According to Pareto analysis [16], 80% of the tasks complete their execution in 20% of the time, while the remaining 20% of the jobs take up the remaining 80% of the time. This principle is used to sort tasks into two parts, and it is recommended that tasks falling into the first category be assigned a higher priority. The 80-20 rule can also be applied to increase productivity: it is assumed that 80% of the productivity can be achieved by doing 20% of the tasks, so if productivity is the aim of time management, these tasks should be prioritized higher. If the higher-priority jobs are placed in the first category, execution takes much less time because the important, prioritized jobs execute first, and the resulting model is closer to optimal. It has been found that the algorithm based on Pareto analysis takes less time to execute the same set of jobs than the algorithm based on the POSEC method [12].
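
The split itself can be sketched as follows; the EstimatedTask class and the use of a simple 80% count cut-off are assumptions made for this illustration:

    import java.util.ArrayList;
    import java.util.Comparator;
    import java.util.List;

    // Sketch of a Pareto-style 80/20 split: the roughly 80% of tasks with the
    // smallest estimated execution time form a high-priority batch that is
    // scheduled first; the remaining 20% form the second batch.
    class EstimatedTask {
        final int id;
        final double estimatedTime;
        EstimatedTask(int id, double estimatedTime) {
            this.id = id;
            this.estimatedTime = estimatedTime;
        }
    }

    public class ParetoSplit {
        static List<List<EstimatedTask>> split(List<EstimatedTask> tasks) {
            List<EstimatedTask> sorted = new ArrayList<>(tasks);
            sorted.sort(Comparator.comparingDouble((EstimatedTask t) -> t.estimatedTime));
            int cut = (int) Math.round(sorted.size() * 0.8);       // first ~80% of tasks
            List<List<EstimatedTask>> batches = new ArrayList<>();
            batches.add(sorted.subList(0, cut));                   // quick tasks, run first
            batches.add(sorted.subList(cut, sorted.size()));       // long tasks, run later
            return batches;
        }
    }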

Hierarchical cloud workflow scheduling schema


A cloud workflow system can coordinate multiple job submissions over cloud services, and the goal of a cloud workflow scheduling schema is to make sure the proper activities are executed by the right service at the right time. Another way to achieve an optimal schedule is to use a hierarchical cloud workflow scheduling schema. Under this schema the whole workflow scheduling is divided into three stages. In the first stage, the incoming job requests are examined for parallel jobs and all parallel jobs are split. In the second stage, jobs are matched with the corresponding candidate services: since each job needs some resources for its execution, resources are assigned to the different jobs according to their requirements. In the third and last stage, a scheduling algorithm is applied to execute these jobs; this can be any algorithm, chosen according to the requirements and for optimality. The approach covers hierarchical cloud service workflow scheduling, parallel splitting of cloud workflow tasks, syntax- and semantics-based matching of cloud workflow tasks, and cloud workflow scheduling and optimization under multiple QoS constraints, and its authors also present experiments evaluating the efficiency of their algorithm. Using a heuristic genetic algorithm, this scheme can achieve an optimal workflow schedule [5].
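
The outline below is only a structural sketch of the three stages; WorkflowJob, CloudService and Schedule are hypothetical placeholder types, and the concrete matching and scheduling policies (e.g. the heuristic genetic algorithm) are left pluggable:

    import java.util.List;

    // Placeholder types assumed for this sketch of the hierarchical schema.
    class WorkflowJob { }
    class CloudService { }
    class Schedule { }

    interface ParallelSplitter {
        // Stage 1: detect jobs that can run in parallel and split them.
        List<List<WorkflowJob>> splitParallelJobs(List<WorkflowJob> workflow);
    }

    interface ServiceMatcher {
        // Stage 2: match each job with a candidate service offering the
        // resources it needs (syntax- and semantics-based matching).
        CloudService match(WorkflowJob job);
    }

    interface WorkflowScheduler {
        // Stage 3: apply a scheduling algorithm to the matched jobs to
        // produce the final, QoS-constrained schedule.
        Schedule schedule(List<List<WorkflowJob>> parallelGroups, ServiceMatcher matcher);
    }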

One-port model and Multi-port model

Specifically for linear workflows, where the dependencies between stages can be represented by a linear graph, two models are used for scheduling: the one-port model and the multi-port model. In the one-port model each processor can perform only one operation at a time, whether computing, receiving an incoming task or sending output, so there is no overlap. In the multi-port model a processor can perform multiple operations, such as computation and receiving input, at the same time, and multiple incoming and outgoing transfers are allowed simultaneously. These two models are useful for linear workflow optimization and help in minimizing latency [8], which leads to an optimal workflow schedule for linear workflows.
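
As a toy comparison, assume each pipeline stage must receive its input (time r), compute (time c) and send its output (time s); the serialization-versus-full-overlap assumption below is a simplification made only for this illustration and is not taken from [8]:

    // Toy per-stage processing time under the two communication models.
    public class PortModels {

        // One-port: the processor performs receive, compute and send one at a time.
        static double onePortStageTime(double r, double c, double s) {
            return r + c + s;
        }

        // Multi-port: receive, compute and send may proceed concurrently, so in
        // steady state the stage is limited by its slowest activity.
        static double multiPortStageTime(double r, double c, double s) {
            return Math.max(r, Math.max(c, s));
        }
    }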

Activity based costing in cloud computing

Activity-based costing in cloud computing is another model for optimal workflow scheduling. Activity-based costing is a way to measure both the cost of an object and its performance. Under this model, a task can be evaluated separately on the basis of the resources, space and time it needs to execute completely. A job can be categorized according to an Available or Partially Available factor.

An Available job is one for which all the resources required for execution are present in a single data center, whereas a Partially Available job is one whose required resources are not present in a single data center, i.e. they are scattered among different data centers and must be collected from them before execution. Jobs are further subdivided on the basis of their dependencies into Dependent and Independent jobs [4]. Scheduling can then be done at this bottom level to optimize workflow scheduling.
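
The two classifications can be sketched as below; the job attributes used here (required resource names and predecessor job IDs) are assumptions made for this illustration:

    import java.util.Set;

    // Sketch of the job categories used in the activity-based costing model.
    public class AbcJobClassifier {

        enum Availability { AVAILABLE, PARTIALLY_AVAILABLE }
        enum Dependency   { DEPENDENT, INDEPENDENT }

        // A job is "Available" if every resource it needs lives in a single
        // data center, otherwise "Partially Available" (resources scattered).
        static Availability availability(Set<String> requiredResources,
                                         Set<String> resourcesInOneDataCenter) {
            return resourcesInOneDataCenter.containsAll(requiredResources)
                    ? Availability.AVAILABLE
                    : Availability.PARTIALLY_AVAILABLE;
        }

        // A job is "Dependent" if it must wait for the output of other jobs.
        static Dependency dependency(Set<Integer> predecessorJobIds) {
            return predecessorJobIds.isEmpty() ? Dependency.INDEPENDENT
                                               : Dependency.DEPENDENT;
        }
    }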

Two levels task scheduling mechanism based on load balancing

A two-level task scheduling mechanism based on load balancing is one way to optimize task scheduling in a cloud computing system. In this method, tasks are scheduled by a scheduling optimizer that takes information from a system model and a predicted-execution-time model, which together keep track of all resources and predicted execution information. The scheduling optimizer itself checks whether the currently prepared schedule is optimized; if not, it regenerates the schedule [15]. Using this model, execution time decreases and resource utilization increases.
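
A rough sketch of that optimizer loop is given below; the system model, predicted-execution-time model, imbalance measure, tolerance and retry cap are all placeholders assumed for this illustration:

    import java.util.List;

    // Sketch of the scheduling-optimizer loop of the two-level mechanism.
    public class TwoLevelOptimizerSketch {

        interface SystemModel       { List<Integer> availableVms(); }
        interface PredictionModel   { double predictedTime(int taskId, int vmId); }
        interface ScheduleCandidate { double loadImbalance(); }
        interface ScheduleBuilder   { ScheduleCandidate build(SystemModel s, PredictionModel p); }

        static ScheduleCandidate optimize(SystemModel sys, PredictionModel pred,
                                          ScheduleBuilder builder) {
            final double tolerance = 0.1;   // assumed acceptable load imbalance
            final int maxAttempts = 10;     // assumed cap on regeneration attempts
            ScheduleCandidate schedule = builder.build(sys, pred);
            for (int i = 0; i < maxAttempts && schedule.loadImbalance() > tolerance; i++) {
                schedule = builder.build(sys, pred);   // regenerate until acceptable
            }
            return schedule;
        }
    }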

PROBLEM FORMULATION

A cloud is a very large network in which millions of users access thousands of servers at all times. These servers may be located at a single site or at different geographical locations. Users send their requests to the cloud servers for processing, and the execution of the tasks/jobs is carried out at the cloud server. Since the number of users is very large, their requests, in the form of tasks/jobs, are also very large in number. Scheduling these tasks/jobs at the server end is difficult, because the requesting jobs are numerous and each one needs some computing or storage resources in order to execute. It is the job of a scheduler to allocate the required resources to each requesting task/job. The schedule built by the server must therefore be good enough that every user request receives a response in time and every task/job gets the proper resources for its execution.

PROPOSED ALGORITHM


To overcome this scheduling problem and give each job/task a better resource with minimum execution time, a Prioritized Workflow Scheduling algorithm is proposed and implemented. The algorithm works in the following three steps (a minimal sketch follows the list):

Step 1: Cluster the incoming jobs/tasks on the basis of their attributes.

Step 2: Within each cluster, apply priorities: each job/task is assigned a priority, and on that basis higher-priority jobs/tasks are executed first.

Step 3: After prioritizing the jobs/tasks within the clusters, assign them to the particular VMs capable of performing the operations and execute them.
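
The sketch below illustrates the three steps. The Job fields, the attribute used as the clustering key and the round-robin assignment of the ordered jobs to VMs are assumptions made for this illustration; in the simulation itself the jobs and machines are CloudSim Cloudlets and Vms.

    import java.util.ArrayList;
    import java.util.Comparator;
    import java.util.LinkedHashMap;
    import java.util.List;
    import java.util.Map;

    // Minimal sketch of the proposed Prioritized Workflow Scheduling steps.
    public class PrioritizedWorkflowSketch {

        static class Job {
            final int id;
            final String attribute;   // clustering key (e.g. job type)
            final int priority;       // higher value = more urgent
            Job(int id, String attribute, int priority) {
                this.id = id; this.attribute = attribute; this.priority = priority;
            }
        }

        // Step 1: cluster incoming jobs on the basis of their attributes.
        static Map<String, List<Job>> cluster(List<Job> jobs) {
            Map<String, List<Job>> clusters = new LinkedHashMap<>();
            for (Job j : jobs) {
                clusters.computeIfAbsent(j.attribute, k -> new ArrayList<>()).add(j);
            }
            return clusters;
        }

        // Step 2: within each cluster, order jobs so higher-priority jobs run first.
        static void prioritize(Map<String, List<Job>> clusters) {
            for (List<Job> c : clusters.values()) {
                c.sort(Comparator.comparingInt((Job j) -> j.priority).reversed());
            }
        }

        // Step 3: assign the ordered jobs to the available VMs
        // (round-robin here, purely for illustration).
        static Map<Integer, List<Job>> assign(Map<String, List<Job>> clusters, int vmCount) {
            Map<Integer, List<Job>> plan = new LinkedHashMap<>();
            int next = 0;
            for (List<Job> c : clusters.values()) {
                for (Job j : c) {
                    plan.computeIfAbsent(next % vmCount, k -> new ArrayList<>()).add(j);
                    next++;
                }
            }
            return plan;
        }
    }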


Fig 4.1: A proposed model (Prioritized Workflow Scheduling)

SIMULATION AND RESULT

Simulation Description

CloudSim v3.0.2 is used to implement workflow scheduling in cloud computing. The simulation was performed on a computer running Windows 7 with the following configuration: Intel® Core™ i3-2350M CPU @ 2.30 GHz, 4 GB DDR3 main memory, and a 500 GB 5400 RPM hard drive.

Virtual machine: A virtual machine (VM) is a software implementation of a machine (i.e. a computer) that executes programs like a physical machine. A virtual machine was originally defined as "an efficient, isolated duplicate of a real machine"; current use includes virtual machines that have no direct correspondence to any real hardware [1].

Table 5.1: Configuration of VMs

Configuration        VM
RAM                  512 MB
No. of processors    1
MIPS                 250
Bandwidth            1000
Storage space        10000 MB
VMM                  Xen

Cloudlet: A cloudlet acts as the input job/task to the cloud environment. In addition to the information describing the task itself, each cloudlet stores the ID of the VM that runs it.
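
For reference, a VM with the Table 5.1 configuration and a cloudlet bound to it can be created roughly as follows, assuming the standard CloudSim 3.0.2 Vm and Cloudlet constructors; the broker ID, cloudlet length and file sizes are illustrative values, not taken from the original experiment:

    import org.cloudbus.cloudsim.Cloudlet;
    import org.cloudbus.cloudsim.CloudletSchedulerTimeShared;
    import org.cloudbus.cloudsim.UtilizationModel;
    import org.cloudbus.cloudsim.UtilizationModelFull;
    import org.cloudbus.cloudsim.Vm;

    // Sketch of creating one VM (per Table 5.1) and one cloudlet bound to it.
    public class CloudSimSetupSketch {

        static Vm createVm(int id, int brokerId) {
            int mips = 250, pes = 1, ram = 512;        // MIPS, processors, RAM (MB)
            long bw = 1000, size = 10000;              // bandwidth and storage (MB)
            return new Vm(id, brokerId, mips, pes, ram, bw, size, "Xen",
                    new CloudletSchedulerTimeShared());
        }

        static Cloudlet createCloudlet(int id, int brokerId, int vmId) {
            long length = 40000, fileSize = 300, outputSize = 300;   // assumed values
            UtilizationModel full = new UtilizationModelFull();
            Cloudlet cloudlet = new Cloudlet(id, length, 1, fileSize, outputSize,
                    full, full, full);
            cloudlet.setUserId(brokerId);
            cloudlet.setVmId(vmId);   // record the VM that will run this cloudlet
            return cloudlet;
        }
    }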

Table 5.2: Comparison of Execution Time with constant number of VMs

Sr. No.   No. of Cloudlets   Execution Time (Proposed approach)   Execution Time (Simple approach)
1         10                 92.28                                98.74
2         50                 98.88                                100.04
3         100                150.74                               188.8
4         150                234.92                               272.84
5         200                284.78                               349.98
6         300                409.06                               496.44
7         400                528.4                                678.06
8         500                677.28                               742.3
9         600                779.23                               871.8
10        700                894.48                               997.1
11        800                1090.34                              1176.44
12        900                1177.71                              1313.89
13        1000               1292.01                              1452.34
14        1100               1427.39                              1602.43
15        1200               1548.36                              1742.15
16        1300               1647.79                              1872.67
17        1400               1801.95                              2026.66
18        1500               1917.78                              2151.2

Table 5.2 compares the execution time of cloudlets under the prioritized workflow algorithm with that of the sequential (simple) workflow execution algorithm, keeping the number of VMs constant at 50. Fig 5.1 shows a graphical comparison of the two algorithms with the number of VMs held constant.

Fig 5.1: Comparison of Execution Time with constant number of VMs

Table 5.3: Comparison of Execution Time with constant number of Cloudlets

Sr. No.   No. of VMs   Execution Time (Proposed approach)   Execution Time (Simple approach)
1         5            6063.79                              6143.41
2         10           3105.64                              3198.25
3         20           1598.21                              1694.14
4         30           1151.5                               1215.3
5         40           835.88                               908.93
6         50           668.52                               756.73
7         60           595.74                               652.52
8         70           516.78                               582.47
9         80           480.03                               554.33
10        90           407.73                               488.54
11        100          369.57                               431.06

Table 5.3 compares the execution time of cloudlets under the prioritized workflow algorithm with that of the sequential (simple) workflow execution algorithm, keeping the number of cloudlets constant at 500. Fig 5.2 shows a graphical comparison of the two algorithms with the number of cloudlets held constant.

Fig 5.2: Comparison of Execution Time with constant number of Cloudlets

CONCLUSION

The simulation shows that prioritized workflow scheduling in cloud computing improves the execution time of jobs/tasks compared with the simple first-come, first-served (sequential) approach, and the improvement grows as the number of requesting jobs/tasks increases.