Abstract- This paper discusses the classification of neural networks in terms of their activation functions. The networks covered are the Threshold Logic Unit, the Hopfield net, ADALINE, and the Perceptron and its types. The paper also covers several learning algorithms: the Back-Propagation Algorithm, Support Vector Machines and Logistic Regression. The Analysis and Discussion section explains the best solution for Optical Character Recognition.
Index Terms- Neural Network, Optical Character Recognition
A neural network is made up of two or more layers of neurons. Each neuron acts as a black box containing an activation function. Given a set of input values, their respective weights, and the threshold (or bias) value, the neuron calculates the output.
Section two describes the different types of neural network, classified according to the activation function they generally use. It discusses each activation function in detail and sets out the pros and cons of each neural network.
Section three discusses the learning algorithms, which can be used only with networks of more than two layers. It covers the different types of algorithm and explains the importance of each.
Section four analyses the application of each neural network and learning algorithm, and explains which method may be most suitable for Optical Character Recognition and why.
There are many different kinds of neural networks, with different names and different implementations. However, the two major classes of neural network are:
Single Layered Neural Network
Multiple Layered Neural Network
Single layered networks generally use the Heaviside step or linear activation functions, whereas multiple layered networks generally use the sigmoid function or similar functions that can easily be differentiated.
Each neural network calculates the weighted sum, defined as the sum of the products of the inputs and their corresponding weights:
Weighted Sum = ∑_i ( input_i × weight_i )
Figure : Each input value is assigned a weight 
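The weighted sum can be sketched in a few lines of Python. This is only an illustration: the function name `weighted_sum` and the sample values are assumptions and do not come from the paper.

```python
def weighted_sum(inputs, weights):
    """Return the sum of each input multiplied by its corresponding weight."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    return sum(x * w for x, w in zip(inputs, weights))


# Illustrative values only (not taken from the paper).
print(weighted_sum([1, 0, 1], [0.5, 0.25, 0.25]))  # 0.75
```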
Heaviside Step function
A step function is a special function whose graph consists of a series of horizontal line segments. It was the first type of activation function used for neural networks. An example function and its graph are shown below:
f (x) = [[x - 1]]
Figure : A graph of a step function 
The graph above shows how the output of the function steps from one level to the next. The step may be as small as 1 unit or as big as 5 units. The output of the function is always a real number. For example, if the value passed to the activation function meets the threshold, the output is 1; otherwise it is 0.
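The two behaviours described here can be sketched as follows. The sketch assumes that the double brackets in f(x) = [[x - 1]] denote the greatest integer (floor) function, which the paper does not state explicitly. The names `staircase` and `threshold_step` are hypothetical.

```python
import math


def staircase(x):
    """Step function f(x) = [[x - 1]], reading [[.]] as the floor (an assumption)."""
    return math.floor(x - 1)


def threshold_step(value, threshold):
    """Return 1 if the value meets the threshold, otherwise 0."""
    return 1 if value >= threshold else 0


print([staircase(x) for x in (0.5, 1.0, 2.7)])  # [-1, 0, 1]
print(threshold_step(0.75, 0.5))  # 1
print(threshold_step(0.25, 0.5))  # 0
```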
The Threshold Logic Unit (TLU) and the Hopfield net are two examples of neural networks that use the Heaviside step function [4, 5]. The difference between them is that the Hopfield net uses binary threshold units: its output converges to a local minimum (as in gradient descent), with each unit settling at zero or one, whilst the TLU is more general.
Linear activation function
The graph of the linear activation function is a straight line. An example function and its graph are shown below:
f (x) = 2x + 2
Figure : Graph of a linear function 
As shown above, a linear activation function combines a linear transformation with a translation. It allows vector addition as well as scalar multiplication within the function. The output of such a function is the weighted sum of the inputs plus the bias term.
The two most common types of neural network that implement this kind of function are the Perceptron and ADALINE. The Perceptron is the simplest form of feed-forward neural network.
Both are single layered neural networks that have the ability to learn. The main difference between them is that ADALINE adjusts its weights according to the weighted sum, whilst the Perceptron adjusts them according to the output of its activation function.
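The difference between the two update rules can be sketched as below. This is a minimal illustration and not the paper's implementation. The function names, the `learning_rate` parameter, the thresholding of the Perceptron's net input at zero, and the sample values are all assumptions.

```python
def net_input(inputs, weights, bias):
    """Weighted sum of the inputs plus the bias term."""
    return sum(x * w for x, w in zip(inputs, weights)) + bias


def perceptron_update(inputs, weights, bias, target, learning_rate=0.1):
    """Perceptron rule: the error is measured on the thresholded output."""
    output = 1 if net_input(inputs, weights, bias) >= 0 else 0
    error = target - output
    new_weights = [w + learning_rate * error * x for x, w in zip(inputs, weights)]
    return new_weights, bias + learning_rate * error


def adaline_update(inputs, weights, bias, target, learning_rate=0.1):
    """ADALINE rule: the error is measured on the raw weighted sum."""
    error = target - net_input(inputs, weights, bias)
    new_weights = [w + learning_rate * error * x for x, w in zip(inputs, weights)]
    return new_weights, bias + learning_rate * error


# Same starting point for both rules (illustrative values).
print(perceptron_update([1, 1], [0.0, 0.0], 0.0, target=1))  # ([0.0, 0.0], 0.0)
print(adaline_update([1, 1], [0.0, 0.0], 0.0, target=1))     # ([0.1, 0.1], 0.1)
```

In this example the Perceptron already outputs the target class, so its weights do not move. ADALINE still adjusts them, because the weighted sum itself (0) differs from the target (1).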
Sigmoid activation function
The sigmoid function resembles a stretched cosine curve. The function and its graph are shown below:
f (x) = 1 / (1 + e^(-x))
Figure : Graph showing a sigmoid curve 
The sigmoid is a simple non-linear function with a region of uncertainty. As the graph above shows, for inputs near zero the output lies between 0 and 1, so it is not clearly one value or the other.
The sigmoid curve is mainly used within multiple layered neural networks, the major example being the multilayer Perceptron. Because it is easily differentiated, it supports the re-learning characteristic of neural networks.
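The sigmoid is easy to differentiate because its derivative can be written in terms of the function itself: f'(x) = f(x) (1 - f(x)). A short sketch follows. The function names are assumptions, and the plain formula is suitable only for moderate inputs, since `math.exp` overflows for very large negative x.

```python
import math


def sigmoid(x):
    """Sigmoid activation f(x) = 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_derivative(x):
    """Derivative of the sigmoid, expressed through the sigmoid itself."""
    s = sigmoid(x)
    return s * (1.0 - s)


print(sigmoid(0.0))             # 0.5
print(sigmoid_derivative(0.0))  # 0.25
```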
Many algorithms make the re-learning of a neural network possible and efficient. The main one is the back propagation algorithm.
The Back Propagation Algorithm
The back propagation algorithm requires the desired output in order to train the network. When the network produces the wrong output, the algorithm goes back through the weights, updates them accordingly and re-calculates, repeating until the network reaches the desired output. [8, 9]
Figure : A neural network implementing back propagation algorithm. 
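The loop of computing an output, comparing it with the desired output and adjusting the weights can be sketched for a very small network. The sketch assumes two inputs, two hidden neurons, one output neuron, sigmoid activations and a squared-error measure. The dictionary layout, the starting weights and the learning rate are illustrative and are not taken from the paper.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(net, inputs):
    """Return the hidden activations and the network output."""
    hidden = [
        sigmoid(sum(x * w for x, w in zip(inputs, ws)) + b)
        for ws, b in zip(net["w_hidden"], net["b_hidden"])
    ]
    output = sigmoid(sum(h * w for h, w in zip(hidden, net["w_out"])) + net["b_out"])
    return hidden, output


def train_step(net, inputs, target, learning_rate=0.5):
    """One back propagation update for a single training example."""
    hidden, output = forward(net, inputs)
    # Output error scaled by the sigmoid derivative o * (1 - o).
    delta_out = (target - output) * output * (1.0 - output)
    # Pass the error back to each hidden neuron through the (old) output weights.
    delta_hidden = [
        delta_out * w * h * (1.0 - h) for w, h in zip(net["w_out"], hidden)
    ]
    net["w_out"] = [w + learning_rate * delta_out * h
                    for w, h in zip(net["w_out"], hidden)]
    net["b_out"] += learning_rate * delta_out
    for j, delta in enumerate(delta_hidden):
        net["w_hidden"][j] = [w + learning_rate * delta * x
                              for w, x in zip(net["w_hidden"][j], inputs)]
        net["b_hidden"][j] += learning_rate * delta
    return output


# Illustrative starting weights (assumed, not from the paper).
net = {
    "w_hidden": [[0.1, -0.2], [0.3, 0.4]],
    "b_hidden": [0.0, 0.0],
    "w_out": [0.5, -0.5],
    "b_out": 0.0,
}
example, desired = [1.0, 0.0], 1.0
print("before:", forward(net, example)[1])
for _ in range(200):
    train_step(net, example, desired)
print("after: ", forward(net, example)[1])
```

Repeating the update moves the output for this example towards the desired value of 1. A real network would be trained over many examples rather than one.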
Support Vector Machines
The support vector machine separates the samples with respect to a small number of key weighted vectors, known as the support vectors. They divide the samples so that those giving the required output are distinguished from the rest.
Figure : How support vector machines work 
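The role of the support vectors can be illustrated on a toy data set. The sketch below does not train a support vector machine. It assumes a linear separating boundary w.x + b = 0 whose weights were worked out by hand for these four points, and it picks out the points that lie exactly on the margin. All names and values are hypothetical.

```python
def decision_value(point, weights, bias):
    """Signed score w.x + b of a point relative to the separating boundary."""
    return sum(x * w for x, w in zip(point, weights)) + bias


def classify(point, weights, bias):
    """Return +1 or -1 depending on the side of the boundary."""
    return 1 if decision_value(point, weights, bias) >= 0 else -1


def support_vectors(points, labels, weights, bias, tolerance=1e-9):
    """Return the points whose margin y * (w.x + b) equals 1."""
    return [
        p for p, y in zip(points, labels)
        if abs(y * decision_value(p, weights, bias) - 1.0) <= tolerance
    ]


# Toy data on the line x = y; the boundary below was derived by hand, not learned.
points = [(-1.0, -1.0), (0.0, 0.0), (2.0, 2.0), (3.0, 3.0)]
labels = [-1, -1, 1, 1]
weights, bias = (0.5, 0.5), -1.0

print(support_vectors(points, labels, weights, bias))  # [(0.0, 0.0), (2.0, 2.0)]
print(classify((2.5, 1.5), weights, bias))             # 1
```

Only the two points nearest the boundary act as support vectors. The outer points could be removed without moving the boundary.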
Logistic Regression
This algorithm predicts the output given the inputs and certain conditional facts. It works on the probability of the output being correct. Each time it learns, its predictions improve and the algorithm gives a better output.
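A minimal sketch of logistic regression with one input feature is given below, trained by plain gradient steps on the log-likelihood. The data, learning rate and number of epochs are assumptions chosen for illustration only.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(xs, ys, learning_rate=0.1, epochs=5000):
    """Fit a weight and a bias so that sigmoid(w * x + b) estimates P(y = 1)."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        # Gradient of the log-likelihood, summed over all samples.
        grad_w = sum((y - sigmoid(w * x + b)) * x for x, y in zip(xs, ys))
        grad_b = sum(y - sigmoid(w * x + b) for x, y in zip(xs, ys))
        w += learning_rate * grad_w
        b += learning_rate * grad_b
    return w, b


def predict_probability(x, w, b):
    """Estimated probability that the output for x is 1."""
    return sigmoid(w * x + b)


# Illustrative data: small inputs belong to class 0, larger ones to class 1.
xs, ys = [0, 1, 2, 3], [0, 0, 1, 1]
w, b = train_logistic(xs, ys)
for x in xs:
    print(x, round(predict_probability(x, w, b), 3))
```

Before training every prediction is 0.5. With each pass the probabilities move towards the correct class, which matches the description of the prediction improving as the algorithm learns.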
Each category of neural network below is illustrated with results based on the inputs given to its function.
Heaviside Step Function
The input was a set of zeros and ones. The threshold was set to 0.5, and each input value was assigned a weight. The neural network gave the correct output for the OR function. The function calculates the f(x) values as:
Table : Step function f(x)= [[ x - 1 ]]
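The OR experiment can be reproduced with a single threshold unit. The threshold of 0.5 is taken from the description above. The weights of 1.0 per input are an assumption, since the paper does not list the weights that were used.

```python
def threshold_unit(inputs, weights, threshold):
    """Output 1 when the weighted sum of the inputs meets the threshold, else 0."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0


weights = [1.0, 1.0]  # assumed weights; not given in the paper
threshold = 0.5       # threshold stated in the text

for a in (0, 1):
    for b in (0, 1):
        print(a, b, threshold_unit([a, b], weights, threshold))
# Prints the OR truth table:
# 0 0 0
# 0 1 1
# 1 0 1
# 1 1 1
```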
The linear function operates as below:
Table : Linear function of f(x) = 2x+2
Once an error is determined, the line on the graph shifts to the right or the left and changes its gradient so that it lies as close as possible to the desired output.
The basic output of the sigmoid function is as below:
Table : Sigmoid curve of f(x) = 1/ (1+ exp (-x))
However, when the curve is shifted using the derivative of the sigmoid function, it gives an uncertain output. The output cannot be predicted, which makes this behaviour the closest to that of a human brain.
Back Propagation Algorithm: It performs a number of iterations before reaching the correct output. It requires a large amount of memory, though its accuracy is quite high.
Support Vector Machines: It works well in most situations, provided the correct support vectors have been chosen. In some situations the support vectors produce a jagged boundary throughout, giving a vague output.
Logistic Regression: It gives an unpredictable output, and thus the reliability of its initial results is poor.
Having said all that, a back propagation neural network that uses the sigmoid function as its activation function would work best in almost all situations.
However, as the above discussion suggests, even a single layer network using the Heaviside step function would work. Indeed, that would work best.