An artificial neural network (ANN) is defined by Techopedia (2010) as a computational model based on the structure and functions of biological neural networks. Information that flows through the network affects the structure of the ANN, because a neural network changes, or learns in a sense, based on that input and output. This means an ANN is capable of 'learning' by observing sample data sets and adjusting its weights through algorithms. These weights determine whether a signal will be passed along or stopped at the next layer. The most common and best understood ANNs in use today have three layers.
Figure 1: A basic ANN (St Louis University)
All three of these layers are connected. The basic system operates as in figure 1, where we initially have some input neurons. These neurons interact with the second, hidden layer. This hidden layer determines whether the signal should be sent forward, stopped or sent elsewhere. After weighing the inputs, the hidden layer sends the signal forward to the outputs. A true ANN, however, uses separate hidden layers to backpropagate errors and to modify outputs more accurately.
The origins of artificial neural networks lie in late-1800s research into understanding the human brain and how it works. In 1943 two researchers, McCulloch and Pitts, produced the first accurate model of a neuron, and the model they created is still used in ANN modelling today. The actual creation of ANNs, however, can be attributed to Marvin Minsky in 1951. The most important discovery relating to ANNs was the creation of the backpropagation algorithm by Werbos between 1974 and 1986. This algorithm allowed errors to be passed backwards and later eliminated by changing the weights of the system. Today artificial neural networks are used as an accurate and cost-effective form of mathematical modelling that requires only several data samples to arrive at outputs or solutions.
II. How ANNs Can Learn Through the Adjustment of Weights by Algorithms
Artificial neural networks gain their unique flexibility from the multiplication and propagation of input signals by weights. A weight can be thought of as the strength of its respective signal, while the input connection can be thought of as a synapse. When the signals a neuron receives from its synapses are strong enough to pass a certain threshold, the neuron is activated and emits new signals down the line. These signals eventually reach the output. Weights can also be negative, causing the signal to be inhibited. Through the combination of many neurons it becomes possible to process information.
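This threshold behaviour can be sketched as a single artificial neuron; the function name and the example weights below are illustrative only, not taken from any particular network:

```python
def neuron(inputs, weights, threshold):
    """Fire (return 1) if the weighted sum of the inputs reaches the threshold."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0

# A positive weight passes the signal along; a negative weight inhibits it.
print(neuron([1, 1], [0.6, 0.6], 1.0))   # weighted sum 1.2 >= 1.0: fires
print(neuron([1, 1], [0.6, -0.6], 1.0))  # weighted sum 0.0 < 1.0: stays silent
```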
For higher weights, the input is multiplied by a larger value. Depending on the weights used in an ANN, the system will produce different outputs and computations. We can, however, influence the output of the system for specific inputs by manipulating the weights through algorithms. In large ANN systems, which possess hundreds or even thousands of artificial neurons across multiple hidden layers, finding all the individual weights by hand would be a long and hard process. As stated previously, this is where algorithms become beneficial: they provide a means of adjusting the weights in order to get the desired output. With a suitable algorithm in place, the system begins to 'learn' or be 'trained'.
The most common form of learning in ANNs is through the backpropagation algorithm, which is commonly used in layered feed-forward ANNs. This means that the neurons are organised in layers similar to figure 1. In this system signals are sent forward and errors are 'propagated backwards', hence the name backpropagation. The inputs and outputs are given their own layers, first and last respectively, with the intermediate layers called the hidden layers. Backpropagation uses a form of training called supervised learning: the user provides the algorithm with examples of ideal inputs and outputs, which the network must compute. This allows the error of the system to be calculated, as the error is the difference between the theoretical and actual results. The system starts from random weights and adjusts them towards the ideal criteria. This method is, however, problematic with many layers: backpropagation takes time to complete, and each added weight, let alone each new neuron or layer, causes the process of setting the weights to grow exponentially. The greatest advantage of backpropagation is that no explicit programming is used and it is an adaptive method. Other methods exist but are not in the scope of this research paper.
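As a minimal sketch of the supervised process just described (the network size, learning rate, iteration count and XOR training set below are illustrative choices, not prescribed by the text):

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Ideal inputs and outputs supplied by the user (here, the XOR truth table).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# Random starting weights: one hidden layer of 4 neurons, then the output layer.
W1 = rng.normal(size=(2, 4))
W2 = rng.normal(size=(4, 1))

def forward(X):
    h = sigmoid(X @ W1)          # hidden-layer activations
    return h, sigmoid(h @ W2)    # network output

_, out = forward(X)
initial_error = np.mean((y - out) ** 2)

for _ in range(5000):
    h, out = forward(X)
    err = y - out                          # difference between ideal and actual
    # Propagate the error backwards through each layer and nudge the weights.
    d_out = err * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 += 0.5 * h.T @ d_out
    W1 += 0.5 * X.T @ d_h

_, out = forward(X)
final_error = np.mean((y - out) ** 2)
print(initial_error, final_error)  # the error shrinks as training proceeds
```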
III. Types of common ANNs
ANNs exist in several common forms. Most often, however, computers will define the entire network themselves from some loose user conditions. The software will build, mathematically describe and optimise the network for the user to review.
Feed-forward ANN:
These networks have only one condition: signals always move forward, flowing from the input layer to the output layer. There are no other limitations placed on this type of system.
Figure 2: Feed-forward ANN
Recurrent ANN:
Figure 3: Recurrent ANN
This ANN is similar to a feed-forward network in that there are no limitations on how the signals propagate, except that this time signals can loop back to any previous neuron. This system exhibits dynamic behaviour, and these networks can use their internal memory to process any sequence of inputs. The most basic form of this network is shown in figure 3, where every neuron is connected to every other neuron in all directions.
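The back-looping described above can be sketched as a single recurrent neuron whose previous output feeds back in as an extra weighted input (the weights, activation function and input sequence are illustrative assumptions):

```python
import math

def recurrent_step(x, state, w_in=1.0, w_back=0.5):
    """One update of a single recurrent neuron: the previous output (state)
    loops back in as an extra weighted input."""
    return math.tanh(w_in * x + w_back * state)

# Processing a sequence: the state carries a memory of earlier inputs forward,
# decaying gradually even after the input itself has gone back to zero.
state = 0.0
for x in [1.0, 0.0, 0.0]:
    state = recurrent_step(x, state)
    print(state)
```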
Hopfield ANN:
Figure 4: Hopfield ANN
This type of network involves the storage of one or more stable target states. These stored states act like memory-recall devices when similar signals are sent through. Each state is recorded as a binary pattern whose neurons only activate if their threshold is exceeded.
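A minimal sketch of storing one stable state and recalling it from a corrupted signal, using the standard Hebbian outer-product rule (the pattern is illustrative, and +1/-1 states are used in place of binary 0/1 to fit the usual update rule):

```python
import numpy as np

# Store one target state via Hebbian outer-product learning.
pattern = np.array([1, -1, 1, -1, 1, -1, 1, -1])
W = np.outer(pattern, pattern)
np.fill_diagonal(W, 0)  # no self-connections

# Corrupt the stored state by flipping one bit.
noisy = pattern.copy()
noisy[0] = -noisy[0]

# Each neuron fires (+1) only if its weighted input exceeds the threshold (0);
# the network settles back to the stored state, acting as memory recall.
recalled = np.where(W @ noisy >= 0, 1, -1)
print(np.array_equal(recalled, pattern))  # True
```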
Elman and Jordan ANNs:
Figure 5: Elman ANN 
Figure 6: Jordan ANN
This is also referred to as the simple recurrent network; it is a special case of the recurrent networks above. The system has the added benefit of memory storage, allowing for the detection and generation of time-varying patterns. The difference between the Elman (figure 5) and Jordan (figure 6) networks is that the feedback occurs at the hidden layer and output layer respectively. Both of these systems can also respond to spatial and temporal patterns.
Long Short Term Memory:
Figure 7: Long Short Term Memory ANN
This is another special case of a recurrent ANN, specialised in learning from experience. As such, the system is uniquely suited to processing, classifying and predicting time series with long, unspecified lags between important events. This attribute makes long short term memory networks the most efficient of the time-recurrent networks. It is achieved by utilising blocks that are capable of remembering values for any length of time. Using memory gates, the system remembers only the most significant inputs; it may then choose to forget a value or output it.
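The memory-gate idea can be sketched with a simplified gated cell. Note this is an assumption-laden illustration: in a real LSTM the gate values are themselves learned from the inputs, whereas here they are supplied directly to keep the sketch short:

```python
import math

def gated_cell(cell, x, forget_gate, input_gate):
    """One simplified LSTM-style update: gates between 0 and 1 decide how much
    of the old value to keep and how much of the new input to store."""
    return forget_gate * cell + input_gate * math.tanh(x)

cell = 0.0
# Store a significant input, then remember it over a long lag of blank inputs.
cell = gated_cell(cell, 2.0, forget_gate=1.0, input_gate=1.0)
for _ in range(10):
    cell = gated_cell(cell, 0.0, forget_gate=1.0, input_gate=0.0)
remembered = cell
# Closing the forget gate makes the cell choose to forget the value.
cell = gated_cell(cell, 0.0, forget_gate=0.0, input_gate=0.0)
print(remembered, cell)
```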
Self-organising map:
Figure 8: Self-organising map
This is a special form of ANN that is similar to a feed-forward ANN but with radically different architecture. The most common arrangement of neurons is a rectangular or hexagonal grid. This type of ANN utilises neighbourhood functions to preserve the properties of the input space. The system uses a form of unsupervised learning to 'produce a low-dimensional, discrete representation of the input space of the training samples, called a map'. These maps turn high-dimensional inputs into low-dimensional outputs. The purpose of such a system is to detect similarities and correlations within input data sets and to adapt future responses for the output values.
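One unsupervised SOM training step can be sketched as follows; the grid size, learning rate and neighbourhood radius are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
# A 4x4 rectangular grid of neurons, each holding a weight vector in 3-D input space.
grid = rng.random((4, 4, 3))

def train_step(grid, x, rate=0.5, radius=1.0):
    """Find the best-matching unit (BMU) for input x, then pull it, and more
    weakly its grid neighbours, toward x: one unsupervised SOM update."""
    dists = np.linalg.norm(grid - x, axis=2)
    bmu = np.unravel_index(np.argmin(dists), dists.shape)
    for i in range(grid.shape[0]):
        for j in range(grid.shape[1]):
            grid_dist = np.hypot(i - bmu[0], j - bmu[1])
            influence = np.exp(-(grid_dist ** 2) / (2 * radius ** 2))
            grid[i, j] += rate * influence * (x - grid[i, j])
    return bmu

x = np.array([0.9, 0.1, 0.1])
before = np.linalg.norm(grid - x, axis=2).min()
for _ in range(20):
    train_step(grid, x)
after = np.linalg.norm(grid - x, axis=2).min()
print(before, after)  # the best-matching unit moves closer to the input
```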
Other Types of ANNs exist but they are outside the scope of this research paper.
IV. What do the Hidden Layers and 'Neurons' Accomplish?
Artificial neural networks are a relatively young field of research, and as such there exists no known rule for how many hidden layers a system should have or how many neurons those layers should contain. In some neural networks (NNs), where the system is a generalised linear model, there is no point in using a hidden layer at all, as it will increase computation time and reduce the resolution of the results. Other mildly nonlinear systems may also not benefit from a hidden layer if the program requires generalisation, has few data inputs, or has too much noise.
As stated above, hidden layers accomplish the actual processing between the inputs and outputs. This is done via weighted connections, where, in general, each neuron is connected to every neuron of the next layer. Figure 3 shows this interconnectedness. There exist specific algorithms that can reduce the number of neurons by checking how each neuron affects the overall output result.
An MLP (multilayer perceptron) is a feed-forward ANN used to solve a range of problems, including but not limited to pattern recognition and nonlinear interpolation. These NNs require several nonlinear hidden layers to produce accurate results at sufficient resolution. The first hidden layer is usually large and is often referred to as the 'universal approximation' layer. A second layer may be used to remove some of the inherent error or instability of the system and produce a more accurate result; however, it is currently uncommon for any ANN system to have more than three layers. Certain architectures require a minimum of two layers to function correctly, such as cascade correlation, the two-spirals problem, and programs that recognise addresses and zip codes. Another use of multiple hidden layers is in RBF (radial basis function) networks. When a second layer is added to this normally single-layer network, the network can recognise irrelevant inputs. This extra 'linear' hidden layer allows the normally radial RBF units to become elliptical, and as a result the root mean squared error can be significantly reduced.
Now that we have described why multiple hidden layers are used, we will look at how many neurons are needed per layer. There is no current equation describing how many neurons should be in a hidden layer; however, some established guidelines exist for defining the architecture, based on:
The number of input and output units
The number of training cases
The amount of noise present in the inputs
The complexity of the learning algorithm
The base architecture
The type of hidden unit activation
The 'training' algorithm
These guidelines give rise to several 'rules of thumb' such as:
"A rule of thumb is for the size of this [hidden] layer to be somewhere between the input layer size ... and the output layer size ..." (Blum, 1992, p. 60). 
"To calculate the number of hidden nodes we use a general rule of: (Number of inputs + outputs) * (2/3)" (from the FAQ for a commercial neural network software company). 
"you will never require more than twice the number of hidden units as you have inputs" in an MLP with one hidden layer (Swingler, 1996, p. 53). 
"How large should the hidden layer be? One rule of thumb is that it should never be more than twice as large as the input layer." (Berry and Linoff, 1997, p. 323). 
"Typically, we specify as many hidden nodes as dimensions [principal components] needed to capture 70-90% of the variance of the input data set." (Boger and Guterman, 1997) 
However, these rules ignore the number of training cases, the noise in the outputs and the complexity of the algorithm. A more intelligent general rule is that there should always be more training cases than weights; how many more can vary anywhere between 2 and 30 times, depending on what the system is trying to accomplish. With too few training cases relative to the weights the system will overfit; with too simple a network it will under-fit. These numbers can also vary depending on whether the system uses regularization or has a large amount of noise present. Another guideline, stated by J. Heaton, author of Introduction to Neural Networks with Java, is that 'the optimal size of the hidden layer is usually between the size of the input and size of the output layers'; a more empirical version of this guideline is that the number of neurons in a hidden layer should be the mean of the input and output layers in a relatively simple system. Even so, it appears that trying many different networks with differing numbers of hidden neurons and measuring the generalisation error of each is the best method.
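The quoted rules of thumb can be compared directly; the input and output counts below are hypothetical, chosen only to make the arithmetic concrete:

```python
# Compare the rules of thumb for a hypothetical network
# with 10 inputs and 2 outputs.
n_in, n_out = 10, 2

two_thirds_rule = (n_in + n_out) * 2 // 3  # (inputs + outputs) * 2/3
upper_bound = 2 * n_in                     # never more than twice the inputs
mean_rule = (n_in + n_out) // 2            # mean of input and output sizes

print(two_thirds_rule, upper_bound, mean_rule)  # 8 20 6
```

Note how widely the suggestions disagree even for the same network, which is why trial and error guided by the generalisation error remains the recommended approach.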
In any ANN there can be any number of hidden neurons in a hidden layer; while more neurons reduce the error and give a more desirable result, computational training time increases exponentially. It therefore becomes a trade-off between speed and accuracy, or simply a matter of optimising the system.
V. Reducing the Number of Nodes and Weights of a System for Optimisation
The most common form of optimisation in an ANN is a process called pruning. Pruning is a set of techniques whereby neuron nodes are trimmed to increase computational performance. The technique involves removing neurons from a network during training, by identifying the nodes which, if removed, would have almost no effect on the network's resolution. These nodes can be easily identified by observing the weight matrix after several rounds of training: the weights on such a neuron will be close to zero. Pruning can be done either manually or through an algorithm. Manual removal simply involves deleting the associated node. An algorithm, however, may remove more neurons than necessary, causing the system to lose resolution. Such a problem can be fixed by defining more neurons or by using a lighter two-step pruning algorithm.
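Weight-based pruning can be sketched as follows; the tolerance and the 'trained' weight matrix below are illustrative assumptions, not values from any real network:

```python
import numpy as np

def prune(weights, tolerance=1e-2):
    """Zero out connections whose trained weight is close to zero:
    removing them has almost no effect on the network's output."""
    pruned = weights.copy()
    pruned[np.abs(pruned) < tolerance] = 0.0
    return pruned

# A hypothetical weight matrix after several rounds of training;
# the near-zero entries mark connections that can safely be trimmed.
W = np.array([[0.80, 0.001, -0.45],
              [0.002, -0.90, 0.003]])
print(prune(W))
```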
VI. Applications of ANNs
Artificial Neural Networks can be applied in almost every field of technology, ranging from market simulations to complex medical processes. This section of the research paper will take a look at several applications of ANNs, in different fields, and the types of inputs involved.
Stock market predictions:
ANNs can be used to predict the future movement of stocks by mapping trends in historical data.
Input data: Opening, max, min, closing and volume data, as well as, the general market index and other related products.
Economic Indicator Forecasts:
Governments around the world utilise ANNs to help predict economic forecasts of how the state of the economy will fare over a period of time.
Inputs: socioeconomic indicators, time-series data of indicators (the period of time that the government wishes to forecast)
Fraud detection:
The police make extensive use of ANNs when dealing with fraud. It is very difficult for humans to identify fraudulent data unless it is reported. ANNs allow active flagging of possibly fraudulent data, causing fraud-related crime to decline, and can automatically decline fraudulent transactions as they occur.
Inputs: transaction parameters, user's information, and other similar related incidents.
Medical diagnosis:
Neural networks can provide valuable aid to doctors by assisting with the analysis of patients and the diagnosis of their symptoms.
Inputs: Patient's medical history, patterns of symptoms, current medical health (monitored by equipment), and the analysis of laboratory results (blood and urinary tests, etc.). The network can also monitor for new problems that arise during surgery.
Plant operation:
An ANN can be used to determine how a plant operates, utilising complex algorithms to obtain the desired outputs.
Quality control:
As with all artificial neural networks, pattern recognition is a powerful tool, which means an ANN is perfectly suited to conducting quality control on goods and industrial machinery. The network can recognise the minute discrepancies between parts caused by faulty machinery or material.
Inputs: product characteristics, quality factors.
Employee selection and hiring:
Due to the ANN's unique pattern recognition ability, the network is able to produce reliable predictions on which potential employee will achieve the best job performance if employed by a company.
Inputs: Background information provided in character references and resume.
These are only some of the many applications of ANNs available to industry.
The most useful advantage of utilising ANNs is their capacity to learn over time, in any environment. This element of learning assists with the complexities of the real world, where the implementation of other processes is extremely difficult, if not impossible. Neural networks shine when tasked with 'classification, function approximation, data processing, filtering, clustering, compression, robotics, regulations, decision making, etc.'. The degree to which an ANN is effective depends on the complexity of the system in use. This means a programmer would not use a linear input-output ANN to solve a complex nonlinear problem; they would instead utilise a multilayered approach to obtain the correct outputs. By either oversimplifying or overcomplicating the system, the ANN will encounter problems in its learning processes and achieve poor results. A solution is to optimise the system using a method such as pruning, or by utilising the correct training algorithms. When an artificial neural network is operating correctly to specification, however, we have one of the most effective and adaptive simulators available in this age.
Techopedia (2010). "Artificial Neural Network (ANN)." Retrieved 7 April 2013, from http://www.techopedia.com/definition/5967/artificial-neural-network-ann.
Klerfors, D. (1998). "Artificial Neural Networks: What are they? How do they work? In what areas are they used?" Retrieved 7 April 2013, from http://osp.mans.edu.eg/rehan/ann/Artificial%20Neural%20Networks.htm.
Sarle, W. S. (2002). comp.ai.neural-nets FAQ, parts 1-7. SAS Institute Inc., Cary, NC, USA.
doug (2010). "How to choose the number of hidden layers and nodes in a feedforward neural network?" Cross Validated. Retrieved 8 April 2013.
Krenker, A., Bešter, J. and Kos, A. (2011). "Introduction to the Artificial Neural Networks", in Suzuki, K. (ed.), Artificial Neural Networks - Methodological Advances and Biomedical Applications. InTech. ISBN 978-953-307-243-2. Available from: http://www.intechopen.com/books/artificial-neural-networksmethodological-advances-and-biomedical-applications/introduction-to-the-artificial-neural-networks.
Alyuda (2013). "Products and Solutions." Retrieved 3 May 2013, from http://www.alyuda.com/products/forecaster/neural-network-applications.htm.