Analysis of Learning Rate Using the BP Algorithm


In recent times, artificial neural networks have seen enormous growth in popularity. The objective of this research is to analyze the learning rate using the BP algorithm for a handwritten digit recognition application. In this paper, variations of the Back Propagation (BP) algorithm of the Artificial Neural Network (ANN) are used. The results are obtained using two variations of the BP algorithm: simple BP and BP with momentum. Different patterns of handwritten digits are used to analyze the performance of the BP algorithm. Various parameters, such as the learning rate, the number of hidden neurons in the hidden layer, and the number of training runs, are varied during the analysis. Simulation results show that the learning rate has a great impact on the performance of the ANN.

ANNs have been used intensively to solve complex engineering problems for more than three decades. Many applications have been investigated and solutions presented by researchers. However, there are still many issues to be addressed by research [24]. Among them, optimization of the ANN is a prominent one [25][26].


The basic element of an ANN is the neuron, which is modeled on the neuron of the human nervous system. In a computer, a neuron performs a computational and communication function [17]. The basic purpose of the ANN is intelligent computing, like human intelligence [17][13]. Neurons can therefore be called processing elements.

Information flows through an ANN in a parallel manner, while knowledge is distributed among the processing units, or neurons [13].

Current work is directed at real-time applications of ANNs. The main concerns in real-time applications are architecture optimization and the accuracy of the required results [15][26][14]. The following points are important in neural network applications:

Learning rate

Accuracy of results

Momentum term

Number of iterations used to train the neural network model.

Architecture of the Neural Network

Number of hidden neurons in the hidden layer.

In this paper, an analysis of the learning rate is presented for the handwritten digit recognition problem using the BP algorithm. The learning rate controls the size of the step taken during the training of the ANN. Improper selection of the learning rate may cause the local minimum problem or require a large number of training runs, which decreases the performance of the ANN [24][29]. This parameter therefore has a significant effect on the accuracy of the network.
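As an illustration (not the paper's code), a single weight update in gradient-descent training can be sketched as follows; the weight and gradient values are invented for the example:

```python
import numpy as np

# Hedged sketch: one gradient-descent weight update. The learning rate
# eta scales the step taken along the negative error gradient.
def update_weights(weights, gradient, eta):
    """Return the weights after a single step of size eta."""
    return weights - eta * gradient

w = np.array([0.5, -0.3])
g = np.array([0.2, 0.1])  # assumed error gradient

small_step = update_weights(w, g, eta=0.1)  # cautious step
large_step = update_weights(w, g, eta=0.9)  # aggressive step
```

A larger eta moves the weights further per iteration; too large a value can overshoot a minimum (the local minimum problem above), while too small a value inflates the number of training runs.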

ANN models can be feedback or feedforward [13]. In real-time applications the feedforward model provides good results [27][15]. A feedforward ANN can be linear or non-linear, and a non-linear ANN can be supervised or unsupervised. In this paper a supervised ANN, the BP model, is used. An ANN model needs training to become intelligent [13][27], and different training algorithms are used for this purpose. Learning governs the procedure of changing the parameters used in the ANN model [15][17]. The most important aspect of ANN models is the adjustment of the weights [13]; the training algorithm adjusts these weights, and for better performance they are adjusted repeatedly.

In this paper, simple Backpropagation and Backpropagation with momentum are used. The main difference between these two algorithms is the way the weights are adjusted: Backpropagation with momentum uses an additional momentum factor.
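The difference between the two update rules can be sketched as follows (an illustrative example, not the authors' implementation; the gradient and previous update are invented values):

```python
import numpy as np

# Simple BP: the weight change is just -eta * gradient.
def bp_step(w, grad, eta):
    return w - eta * grad

# BP with momentum: a fraction alpha of the previous weight change is
# added to the current one, smoothing the trajectory of the weights.
def bp_momentum_step(w, grad, prev_delta, eta, alpha):
    delta = -eta * grad + alpha * prev_delta
    return w + delta, delta

w = np.zeros(3)
grad = np.array([1.0, -2.0, 0.5])     # assumed error gradient
prev = np.array([0.05, 0.02, -0.01])  # assumed previous update

w_simple = bp_step(w, grad, eta=0.1)
w_momentum, delta = bp_momentum_step(w, grad, prev, eta=0.1, alpha=0.1)
```

The momentum term lets successive updates in the same direction build up speed, which is why the tables below report it separately for values 0.1 and 0.2.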

The remainder of the paper is organized as follows: the next section discusses related work in the field of ANNs, the BP algorithm, and optimization techniques. Section 3 describes the proposed architecture. Section 4 describes training and testing, while Section 5 presents the results of experiments on handwritten digit recognition. Section 6 presents the conclusion.

2. Literature survey:

Analysis of the learning rate in an ANN is very important, yet it involves many issues that differ from common ANN practice. The learning rate has been used frequently in ANNs but rarely analyzed for specific applications. Models of ANN applications are expensive to create and limited in applicability. For these reasons, the learning rate should be analyzed for particular applications, especially pattern recognition. In the past few years, efforts have been made in pattern recognition, particularly character and digit recognition.


In a Backpropagation neural network, the learning rate parameter can have a significant effect on generalization accuracy [27][18][15]. The selection of a small or large learning rate directly affects both the generalization accuracy and the training of the network architecture [27][18].

There is a need to select an optimal learning rate without user interaction [25][27]; choosing a learning rate has always been something of a magic art for neural network practitioners [18][27]. The back propagation algorithm proved to be a significant breakthrough in neural network research [13][15][27]. To check the performance of a neural network algorithm, a variable number of hidden neurons is used in the hidden layer [13]. A neural network with one hidden layer has proved to be fast and efficient [13].

In [10], for Italian continuous digit recognition, the training of the artificial neural network was done with standard Backpropagation on a fully connected feedforward network. The results obtained are 98.68% WA (word accuracy) and 90.76% SA (sentence accuracy) on the test set.

In [9], the authors used a neural network classifier with a statistical method for feature extraction of handwritten digits, obtaining accuracy of up to 98%.

In [14], the author applied handwritten digit recognition to neural network chips and automatic learning. The results appear to be the state of the art in digit recognition. The author concludes that a general-purpose neural network chip can be integrated as an accelerator in a large network.

In [15], the author uses tangent vectors to improve handwritten digit recognition accuracy. The results are compared with the original results, and a 2% improvement in accuracy is noted.

In [16], the author uses a 3-stage classifier for handwritten digit recognition: at stages 1 and 2 a neural network is used, and at stage 3 a support vector machine is used. The recognition rate obtained is among the best on the MNIST database, and these results were also better than a single SVM using the same feature set.

3. Proposed architecture:

The proposed architecture used for the analysis in this paper, i.e. the neural network for handwritten digits, is shown in figure (1), and the parameters used are shown in Table (1).

Figure (1). Hand Written Digit Architecture

The characters used are in the form of a matrix of size 10x10. The architecture parameters of figure (1) are: 100 input neurons; a varying number of hidden neurons (6, 8, 10, 12, and 15); and 10 output neurons. The activation function (A.F.) used at the hidden and output neurons is the sigmoid function.
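A minimal sketch of the forward pass through this architecture (with random placeholder weights, not trained ones) might look like:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    # Activation function used at the hidden and output neurons.
    return 1.0 / (1.0 + np.exp(-x))

# 100 inputs (a flattened 10x10 digit), 10 hidden neurons (one of the
# sizes tried: 6, 8, 10, 12, 15), 10 output neurons (one per digit).
n_input, n_hidden, n_output = 100, 10, 10
W1 = rng.standard_normal((n_hidden, n_input)) * 0.1  # placeholder weights
W2 = rng.standard_normal((n_output, n_hidden)) * 0.1

digit = rng.integers(0, 2, size=n_input)  # a binarized 10x10 pattern
hidden = sigmoid(W1 @ digit)
output = sigmoid(W2 @ hidden)
predicted = int(np.argmax(output))  # index of largest output = digit
```

Each of the 10 outputs lies in (0, 1) because of the sigmoid, and the largest one is taken as the recognized digit.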

4. Training and Testing of Proposed Architecture

4.1 DATA Sets

The characters used here are digits written in the form of a 10×10 matrix. The library of handwritten digits, containing 1000 digits, was prepared by the authors. The digits in the library were written by different persons: square blocks of 10×10 (as shown in the figure) were printed on pages, and different people were asked to write digits in these blocks.

4.2 PREPROCESSING

The digits were written in square blocks of 10×10. Each writer was asked to write within the writing area; no restriction was imposed on the content or style of writing. The writers consisted of university students, professors, and university employees. The digits written in the square blocks were processed using Matlab: the characters were scanned and then converted into 0s and 1s, where 1 represents the presence of written material and 0 its absence.
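A hypothetical version of this thresholding step (sketched in Python rather than Matlab; the threshold value is an assumption, not from the paper):

```python
import numpy as np

def binarize(block, threshold=128):
    """Map a scanned grayscale block to 0s and 1s (ink = 1)."""
    # Scanned ink is dark (low intensity), so pixels below the
    # threshold are marked as written material.
    return (np.asarray(block) < threshold).astype(int)

# Tiny 2x2 example in place of a full 10x10 scan.
scan = np.array([[255, 30],
                 [200, 10]])
bits = binarize(scan)
```

The resulting 0/1 matrix is then flattened into the 100-element input vector fed to the network.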

4.3 TRAINING SETS

Training sets are used to train the ANN and adjust its weights. The ANN should not be made too specific, giving precise results for the training data but incorrect results for all other data [21][22]; when this happens, the ANN is said to be over-fitted [13]. A total of 1000 handwritten digits, collected from different persons, are used as the training data set.

4.4 VALIDATION sets

Validation sets are used to avoid over-fitting to the training data; an ANN without a validation set is likely to be over-fitted [13][21]. In this paper, 100 data sets are used as validation sets, showing that the designed neural network is trained properly.
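One common way a validation set guards against over-fitting is early stopping; the sketch below uses invented error values and is not taken from the paper:

```python
# Training stops when the validation error stops improving, even if the
# training error is still falling; beyond that point the network is
# starting to memorize the training data. The error curves below are
# invented for illustration.
train_err = [0.50, 0.30, 0.20, 0.15, 0.12, 0.10]  # keeps falling
valid_err = [0.52, 0.35, 0.25, 0.24, 0.26, 0.30]  # rises after epoch 3

best_epoch, best_err = 0, float("inf")
for epoch, err in enumerate(valid_err):
    if err < best_err:
        best_epoch, best_err = epoch, err
# The weights saved at best_epoch would be the ones kept.
```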

4.5 TESTING sets


The proposed algorithm is evaluated on samples taken from individuals who did not participate in setting up the training data set. To check for effective training of the designed network (and to avoid over-fitting), a testing set of 250 characters, selected randomly, is used.

5. Results And analysis:

In the training phase of the ANN, 1000 handwritten characters are used as training patterns and 100 patterns of different characters are used as validation patterns. The accuracy of the results for handwritten digits is shown in the form of tables.

5.1 Using Simple BP algorithm

The results of table (1) are obtained using the simple BP algorithm.

Table (1): Results of Hand Written Digits using the simple BP algorithm

Accuracy of results: number of hidden neurons (rows) against learning rate (columns)

Hidden neurons |   0.1   |   0.2   |   0.4   |   0.6   |   0.8   |   0.9
       6       | 90.54 % | 92.62 % | 93.74 % | 94.17 % | 93.05 % | 95.71 %
       8       | 93.73 % | 96.03 % | 97.63 % | 98.25 % | 98.66 % | 98.67 %
      10       | 94.89 % | 96.53 % | 97.83 % | 98.39 % | 98.77 % | 98.87 %
      12       | 93.68 % | 96.84 % | 98.32 % | 98.63 % | 98.71 % | 98.95 %
      15       | 94.77 % | 96.77 % | 97.97 % | 98.52 % | 98.80 % | 98.99 %

The number of training iterations used to obtain the results of table (1) is held constant at 150.

Figure (2): Accuracy of results against learning rate according to table (1)

Analysis: To support the hypothesis of the abstract, the analysis of table (1) and figure (2) is given below.

The accuracy of the results changes as the learning rate is increased or decreased, for a constant number of training iterations.

For each number of hidden neurons in the hidden layer, increasing or decreasing the learning rate also changes the accuracy of the results, for a constant number of training iterations.

5.2 Using BP with momentum algorithm

The results of table (2) are obtained using the BP with momentum algorithm, with a momentum value of 0.1.

Table (2): Results of Hand Written Digits using the BP with momentum algorithm with 0.1 value of momentum

Hidden neurons |   0.1   |   0.2   |   0.4   |   0.6   |   0.8   |   0.9
       6       | 93.93 % | 96.53 % | 97.97 % | 98.64 % | 98.45 % | 98.65 %
       8       | 95.35 % | 96.93 % | 98.11 % | 98.79 % | 98.96 % | 99.13 %
      10       | 96.35 % | 97.27 % | 97.95 % | 98.40 % | 98.83 % | 98.82 %
      12       | 95.89 % | 97.07 % | 98.57 % | 98.88 % | 99.07 % | 99.00 %
      15       | 96.42 % | 97.71 % | 98.39 % | 98.74 % | 98.93 % | 97.71 %

Figure (3): Accuracy of results against learning rate according to table (2)

Table (3): Results of Hand Written Digits using the BP with momentum algorithm with 0.2 value of momentum

Hidden neurons |   0.1   |   0.2   |   0.4   |   0.6   |   0.8   |   0.9
       6       | 95.56 % | 96.66 % | 98.18 % | 98.35 % | 98.01 % | 98.12 %
       8       | 95.60 % | 97.44 % | 98.41 % | 98.92 % | 99.09 % | 99.00 %
      10       | 96.43 % | 98.23 % | 97.84 % | 97.43 % | 98.76 % | 98.95 %
      12       | 96.51 % | 97.79 % | 98.69 % | 99.08 % | 99.07 % | 97.12 %
      15       | 96.88 % | 97.86 % | 98.18 % | 98.68 % | 99.02 % | 99.10 %

Figure (4): Accuracy of results against learning rate according to table (3)


The momentum term normally takes a small value [13], so results are shown for only two values of the momentum term, 0.1 and 0.2.

Analysis: To support the hypothesis of the abstract, the analysis of tables (2, 3) and figures (3, 4) is given below:

The accuracy of the results changes as the learning rate is increased or decreased, for a constant number of training iterations.

For each number of hidden neurons in the hidden layer, increasing or decreasing the learning rate changes the accuracy of the results, for a constant number of training iterations.

Changing the value of the momentum term while increasing or decreasing the learning rate also changes the accuracy of the results.

6. Conclusion:

This paper concludes that the learning rate affects the accuracy of results for the handwritten digit recognition problem. In the simple BP algorithm, increasing or decreasing the learning rate changes the accuracy of the results, and this holds across the different numbers of hidden neurons tried in the hidden layer. In BP with momentum, changing the learning rate likewise changes the accuracy of the results, for different values of the momentum term and for different numbers of hidden neurons in the hidden layer.