In this paper, we first discuss the advantages of parallel computing over serial computing. We then describe the advantages of neural networks and decide which type of neural network to use for predicting the host load of a system in a grid environment. We achieve better results, with low overhead, for different types of systems. We also observe that the mean and standard deviation for these systems are reduced by 60% and 70%, respectively, by using neural networks. The training and testing times are also short, so neural networks can readily be applied in a real-time environment.
Traditionally, serial computing means that a single processor performs all the computations. The problem is broken into a series of smaller instructions, which are executed one after another, so only one instruction can execute at a time. With parallel computing we have multiple processing units, and different instructions run concurrently on different processors. By using parallel computing we can therefore save time, money and computer memory, and provide concurrency.
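The difference between the two styles can be sketched in a few lines of Python. This is only an illustration of the structure, not part of the experiments: the `square` function and the worker count are hypothetical stand-ins for real work. Note that in CPython, threads do not speed up CPU-bound code; `ProcessPoolExecutor` offers the same `map` interface for that case.

```python
from concurrent.futures import ThreadPoolExecutor


def square(x):
    """Hypothetical stand-in for one independent unit of work."""
    return x * x


def run_serial(items):
    # A single worker handles every item, one after another.
    return [square(x) for x in items]


def run_parallel(items, workers=4):
    # Several workers take items concurrently; map() still returns the
    # results in input order, regardless of which worker finishes first.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(square, items))


items = list(range(8))
serial_result = run_serial(items)
parallel_result = run_parallel(items)
```

Both functions produce the same list; only the way the work is distributed differs.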
Grid computing is a branch of parallel computing. It is designed so that different resources in a network built on open standards can be used at the same time to achieve high performance. To deliver this performance efficiently, the computations need to be scheduled.
In grid computing, a node must be chosen so that the running time of a task is reduced. In this paper we predict the host load of a system, which can in turn be used to predict the running time. The host load is directly proportional to the running time, and the CPU's availability can also be derived from it.
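As a minimal sketch of how a predicted host load could drive node selection, assume that the scheduler simply picks the node with the lowest predicted load. The node names and load values below are hypothetical and are not taken from the experiments.

```python
def pick_node(predicted_load):
    """Return the name of the node with the smallest predicted host load."""
    if not predicted_load:
        raise ValueError("no candidate nodes")
    return min(predicted_load, key=predicted_load.get)


# Hypothetical predicted loads, for illustration only (not measured values).
predicted_load = {"node_a": 0.82, "node_b": 0.35, "node_c": 0.51}
chosen = pick_node(predicted_load)
```

Because running time is taken to be proportional to host load, the node with the lowest predicted load is also the one expected to finish the task soonest.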
Host load can also be estimated by other methods, such as linear models [9,10] and tendency-based models [11,12]. These methods can be inaccurate because they do not take the dynamics of grid computing into account. We use a neural network approach to estimate the host load of the CPU. Before doing so, we ask three questions: first, whether neural networks perform better than the traditional methods; second, what the costs are in terms of validation, testing and prediction rate; and third, whether the neural network can be applied to a real-time grid environment.
Experiments are performed in which the neural networks are trained on values collected over a period of ten days. We then observe whether the host load can be predicted for the next ten days without retraining. We also require that the mean errors be low and that the results be applicable to a real-world scenario.
Area of Investigation:
The area of investigation is host load prediction in the grid environment. We use four different Unix systems: axp0, axp7, sahara and themis. We perform host load prediction on these four systems using neural networks, and we also investigate whether neural networks will work in a real-time grid environment.
Applications of the Area:
An artificial neural network (ANN) is an information-processing model that works approximately like the human brain. In this technology, the network normally consists of a large number of processing units running in parallel.
In the context of ANNs, the data set is partitioned randomly into two parts: the training set and the testing set. The model and its parameters are obtained using the training set, and the resulting model is then evaluated on the testing set. If the model performs satisfactorily on the testing set, it is considered valid for further use; otherwise a new partition may be made and the whole process repeated to obtain a more valid model. In neural networks, the relationships between input and output pairs are constructed through the learning process. Learning starts with arbitrary weight values and a learning rate. At each cycle the weights are adjusted according to the learning algorithm, and learning continues until the network produces good results on the training set. The weight vector constructed during training is then tested on a different set; if it also gives good results there, it can be kept for future use.
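The random partition described above can be sketched as follows. The 70/30 split fraction and the fixed seed are assumptions made only for this illustration; the text does not specify them.

```python
import random


def random_partition(samples, train_fraction=0.7, seed=0):
    """Shuffle a copy of the samples and cut it into training and testing sets.

    train_fraction and seed are hypothetical defaults, not values from the text.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


data = list(range(10))
train_set, test_set = random_partition(data)
```

Calling the function again with a different seed yields a new partition, which corresponds to repeating the process when the first model is not satisfactory.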
We use a feedforward neural network trained with the back-propagation algorithm to predict the host loads in an environment. Such networks can handle arbitrary inputs and produce accurate input-output mappings. They are simple, easy to apply, and can readily be used in a real-time environment.
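The cycle-by-cycle weight adjustment can be illustrated with the smallest possible case: one sigmoid node trained by gradient descent on a squared error. This is a sketch of the general idea under those assumptions, not the training code used in the experiments; the inputs, target and starting weights are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_step(weights, bias, inputs, target, learning_rate):
    """One gradient-descent update for a single sigmoid node."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    output = sigmoid(z)
    # Gradient of 0.5 * (output - target)**2 with respect to z.
    delta = (output - target) * output * (1.0 - output)
    new_weights = [w - learning_rate * delta * x for w, x in zip(weights, inputs)]
    new_bias = bias - learning_rate * delta
    return new_weights, new_bias, 0.5 * (output - target) ** 2


# Hypothetical starting weights, inputs and target, for illustration only.
weights, bias = [0.1, -0.2], 0.0
errors = []
for _ in range(200):
    weights, bias, err = train_step(weights, bias, [0.5, 0.9], 0.8, learning_rate=0.3)
    errors.append(err)
```

The recorded error shrinks over the cycles, which is the behaviour that training relies on; a full back-propagation implementation applies the same kind of update to every layer.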
Feed Forward Neural Networks:
The figures above represent the feedforward neural network used to estimate the host load in an environment. The network has 4 inputs, where the external information is received, and one output layer C, where the solution is obtained. The input and output layers are separated by two hidden layers. The connections in the network indicate the flow of data between nodes.
Every layer has the same number of inputs. Each connection in the figure is modified by a weight, and each node also has one extra input with a constant value of 1. The weight that modifies this extra input is the bias.
Whenever the network is fed with input, each node performs its calculation and transfers the result to the next layer. The output of a node is computed as

$$y = h\left(\sum_{i=1}^{n} w_i x_i + b\right)$$

where $y$ is the output of the current node, $n$ is the number of nodes in the previous layer, $x_i$ is an input to the current node from the previous layer, $w_i$ is the weight modifying the corresponding connection from $x_i$, and $b$ is the bias. In addition, $h(\cdot)$ is either a sigmoid activation function for hidden-layer nodes or a linear activation function for the output-layer nodes.
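The node computation described above (a weighted sum of the inputs plus a bias, passed through an activation function) can be sketched in plain Python. The layer sizes of the two hidden layers and all weight values below are hypothetical; only the overall shape (4 inputs, two sigmoid hidden layers, one linear output) follows the text.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def linear(z):
    return z


def layer_forward(inputs, weights, biases, activation):
    """Compute activation(sum_i w_i * x_i + b) for every node in one layer."""
    return [
        activation(sum(w * x for w, x in zip(node_weights, inputs)) + b)
        for node_weights, b in zip(weights, biases)
    ]


def network_forward(inputs, layers):
    """Pass the inputs through each layer in turn."""
    values = inputs
    for weights, biases, activation in layers:
        values = layer_forward(values, weights, biases, activation)
    return values


# Hypothetical weights: 4 inputs -> 3 hidden -> 2 hidden -> 1 output.
layers = [
    ([[0.1, 0.2, -0.1, 0.05], [0.0, -0.3, 0.2, 0.1], [0.25, 0.1, 0.1, -0.2]],
     [0.0, 0.1, -0.1], sigmoid),
    ([[0.3, -0.2, 0.1], [-0.1, 0.4, 0.2]], [0.05, -0.05], sigmoid),
    ([[0.5, -0.5]], [0.2], linear),
]
output = network_forward([0.2, 0.4, 0.6, 0.8], layers)
```

Each tuple in `layers` holds one weight row per node, so the number of rows sets the layer width and the row length must match the width of the previous layer.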
REQUIREMENTS FOR THE EXPERIMENT:
The Input Parameters:
For a neural network it is important to first identify the network parameters correctly for performance reasons; these include the number of hidden layers and the number of nodes in the network. Through trial and error and from previous experiments, we observed that two hidden layers are sufficient, while the number of inputs can be larger. The experiment requires one output, and the node configurations examined are 20:10:1, 30:10:1, 50:20:1 and 60:30:1. The neural network also requires a learning rate for training; the learning rates examined are 0.01, 0.05, 0.1, 0.2 and 0.3.
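The combinations of node configuration and learning rate listed above form a small experiment grid. The sketch below only enumerates that grid; the dictionary layout is an assumption for illustration, and no training is performed.

```python
from itertools import product

# Node configurations and learning rates as listed in the text.
ARCHITECTURES = [(20, 10, 1), (30, 10, 1), (50, 20, 1), (60, 30, 1)]
LEARNING_RATES = [0.01, 0.05, 0.1, 0.2, 0.3]

# One entry per (configuration, learning rate) pair to be trained.
configurations = [
    {"architecture": arch, "learning_rate": rate}
    for arch, rate in product(ARCHITECTURES, LEARNING_RATES)
]
```

With four configurations and five learning rates, this gives 20 networks to train per load trace.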
The data are then fed to the network in the form of time series for evaluation. We use load traces collected by Dinda on Unix systems: axp0, axp7, sahara and themis. Load is defined here as the processes that are about to run or that have been placed in a queue by the scheduler. The load traces cover different capture periods and machine types.
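The text does not say how the series is presented to the network. One common approach, assumed here purely for illustration, is a sliding window: a fixed number of past load values form the input and the next value is the target. The window size and the trace values below are hypothetical.

```python
def make_windows(series, window_size):
    """Return (inputs, target) pairs: window_size past values -> next value."""
    pairs = []
    for start in range(len(series) - window_size):
        inputs = series[start:start + window_size]
        target = series[start + window_size]
        pairs.append((inputs, target))
    return pairs


# Hypothetical load values, for illustration only (not from the Dinda traces).
trace = [0.10, 0.20, 0.15, 0.30, 0.25, 0.40]
pairs = make_windows(trace, window_size=3)
```

A trace of length 6 with a window of 3 yields three training pairs; a series shorter than or equal to the window yields none.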
Because the output of the sigmoid activation function lies between 0 and 1, we preprocess the load traces to normalize their values into this range. We then examine the means and standard deviations of the load traces to confirm the normalized values. We apply the following normalization formula to each value $x$ in a load trace:

$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}\,(\mathrm{ub} - \mathrm{lb}) + \mathrm{lb}$$

where $x_{\max}$ and $x_{\min}$ are the maximum and minimum values of each load trace, respectively, and $\mathrm{lb}$ and $\mathrm{ub}$ are the lower and upper bounds of the target interval [0.1, 0.9]. The normalized load traces are presented in the form of a table.
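The normalization step can be sketched as a linear min-max rescaling into [0.1, 0.9]. The exact form of the formula is assumed from the description of its terms (trace minimum, trace maximum, lower and upper bound); the sample trace is hypothetical.

```python
def normalize_trace(trace, lower=0.1, upper=0.9):
    """Rescale a load trace linearly so its values span [lower, upper]."""
    lo, hi = min(trace), max(trace)
    if hi == lo:
        raise ValueError("trace is constant; cannot rescale")
    scale = (upper - lower) / (hi - lo)
    return [lower + (x - lo) * scale for x in trace]


# Hypothetical raw load values, for illustration only.
trace = [0.0, 1.0, 2.0, 4.0]
normalized = normalize_trace(trace)
```

The trace minimum maps to 0.1 and the maximum to 0.9, which keeps the targets away from the flat ends of the sigmoid.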
The normalized load traces are divided into three sets: a learning set, a testing set and a validating set, accounting for 50%, 30% and 20% of the data, respectively. The neural network is trained on the learning set with the given learning rate to determine its connection weights. The normalized mean square error (NMSE) is measured on the validating set and is used to decide whether the learning process should be stopped. If the error is measured above 20%, training is stopped and the previous weight values are used.
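The three-way split and the stopping rule can be sketched as below. Splitting the trace in order (rather than randomly) is an assumption, as is the representation of the training history as a list of (weights, validation error) pairs; the weight labels and error values are hypothetical.

```python
def split_trace(values, fractions=(0.5, 0.3, 0.2)):
    """Split a trace, in order, into learning, testing and validating sets."""
    n = len(values)
    first = round(n * fractions[0])
    second = first + round(n * fractions[1])
    return values[:first], values[first:second], values[second:]


def weights_to_keep(history, threshold=0.20):
    """Return the weights from the last cycle before the validation error
    first exceeded the threshold, or the final weights if it never did.

    history is a list of (weights, validation_error) pairs, one per cycle.
    """
    previous = None
    for weights, validation_error in history:
        if validation_error > threshold:
            return previous
        previous = weights
    return previous


normalized = [i / 10 for i in range(10)]  # hypothetical trace values
learning, testing, validating = split_trace(normalized)

# Hypothetical validation errors per training cycle.
history = [("w0", 0.12), ("w1", 0.15), ("w2", 0.27)]
kept = weights_to_keep(history)
```

In this example the third cycle exceeds the 20% threshold, so the weights from the second cycle are the ones retained.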
Results of the Host Load Prediction:
In the following figures, the y-axis represents the normalized mean square error (NMSE) and the mean and standard deviation for the different load traces. The load value prediction error is calculated from N, the number of load values, and from the predicted and actual values of the i-th load value.
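The exact error formula is not shown in this text. As a hedged illustration, the sketch below uses one common definition of the normalized mean square error: the mean squared difference between predicted and actual values, divided by the variance of the actual values. This definition is an assumption and may differ from the one used for the reported results.

```python
from statistics import mean, pvariance


def nmse(predicted, actual):
    """Mean squared error divided by the variance of the actual values.

    Assumed definition for illustration; not confirmed by the text.
    """
    if len(predicted) != len(actual) or not actual:
        raise ValueError("predicted and actual must be non-empty and equal in length")
    variance = pvariance(actual)
    if variance == 0:
        raise ValueError("actual values are constant; NMSE is undefined")
    mse = mean((p - a) ** 2 for p, a in zip(predicted, actual))
    return mse / variance


# Hypothetical actual values; the "prediction" is simply their mean.
actual = [1.0, 2.0, 3.0, 4.0]
baseline = nmse([2.5, 2.5, 2.5, 2.5], actual)
perfect = nmse(actual, actual)
```

Under this definition a perfect prediction scores 0, and always predicting the mean of the actual values scores 1, so values of a few percent indicate predictions far better than that trivial baseline.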
The mean, standard deviation (SD) and NMSE vary with the learning rate. As the figure shows, the NMSE lies in the range [3.4%, 7.5%] for axp0, [3.8%, 5.1%] for axp7, [3.5%, 7.5%] for sahara and [0.7%, 1.2%] for themis. The mean lies in the range [2.6%, 3.2%] for axp0, [1.1%, 1.8%] for axp7, [5.5%, 12.8%] for sahara and [1.6%, 3.7%] for themis. The network architecture and the learning rate both influence the training time, as shown in the table below.
The training set size also influences the training time of the networks. A large training set takes a long time to train on, while a small training set may be insufficient to train the network. On average, a training set of 100,000 samples is estimated to correspond to 5 days of host load. Unlike the training time, the validating and testing times do not depend on the training parameters. The following figure shows the average validating and testing time for each host load trace. The larger the data set, the longer the validating and testing times. These times also increase with the size of the network architecture.
Comparison of the Results:
The mean and standard deviation are used to compare the performance of the neural networks constructed here with that of the previous methods. The networks compared are 20:10:1 and 30:10:1, both with a learning rate of 0.3. The following table shows a reduction in mean of 60% and in standard deviation of 70%. For sahara the reduction relative to previous methods is only up to 30%, while it is much higher for axp0 and axp7.
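A reduction figure of this kind is a relative change against the earlier method's value. The sketch below shows the arithmetic only; the two input numbers are hypothetical and are not results from the comparison table.

```python
def percent_reduction(baseline, improved):
    """Relative reduction of `improved` with respect to `baseline`, in percent."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (baseline - improved) * 100.0 / baseline


# Hypothetical error means for an earlier method and for the neural network.
reduction = percent_reduction(10.0, 4.0)
```

A value that falls from 10 to 4 is reduced by 60%; a negative result would mean the new method is worse than the baseline.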
The cost and performance of the neural network were studied using load trace analysis for host load prediction. The neural network proved much better than the previous methods, reducing the mean and standard deviation by 60% and 70%, respectively, and 100,000 training samples are enough to produce accurate predictions.
Given its low cost and prediction ability, the neural network can clearly be applied in real time in a grid environment.