Using Matlab To Develop Artificial Computer Science Essay



2.1 Neural network

A neural network is a massively parallel distributed processor made up of simple processing units that has a natural propensity for storing experiential knowledge and making it available for use. An artificial neural network (ANN) is an Artificial Intelligence technique that mimics the behavior of the human brain (Haykin, 2009).

ANNs can model linear and non-linear systems without the implicit assumptions required by most traditional statistical approaches. They have been applied in various areas of science and engineering (Rivard & Zmeureanu, 2005; Chantasut et al., 2005).

ANNs can be grouped into two major categories: feed-forward and feedback (recurrent) networks. In the former, no loops are formed by the network connections, while one or more loops may exist in the latter. The most commonly used family of feed-forward networks is the layered network, in which neurons are organized into layers with connections strictly in one direction from one layer to another (Jain et al., 1996).

2.2 Multilayer perceptron (MLP)

MLPs are the most common type of feed-forward network. Fig. 1 shows an MLP, which has three types of layers: an input layer, an output layer and a hidden layer.

Neurons in the input layer only act as buffers for distributing the input signals xi (i = 1, 2, …, n) to neurons in the hidden layer. Each neuron j (Fig. 2) in the hidden layer sums up its input signals xi after weighting them with the strengths of the respective connections wji from the input layer and computes its output yj as a function f of the sum:

yj = f(Σi wji xi) (1)

f can be a simple threshold function or a sigmoidal, hyperbolic tangent or radial basis function.


Fig. 1. A Multi-layered perceptron (MLP) network

Fig. 2. Detail of the perceptron process

The output of neurons in the output layer is computed similarly. The backpropagation algorithm, a gradient descent algorithm, is the most commonly adopted MLP training algorithm. It gives the change Δwji in the weight of a connection between neurons i and j as

Δwji = η δj xi (2)

where η is a parameter called the learning rate and δj is a factor depending on whether neuron j is an output neuron or a hidden neuron. For output neurons,

δj = (∂f/∂netj)(yj(t) − yj) (3)

and for hidden neurons

δj = (∂f/∂netj)(Σqwjqδq) (4)

In Eq. (3), netj is the total weighted sum of input signals to neuron j and yj(t) is the target output for neuron j.

As there are no target outputs for hidden neurons, in Eq. (4) the difference between the target and actual output of a hidden neuron j is replaced by the weighted sum of the δq terms already obtained for the neurons q connected to the output of j.
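The delta rules in Eqs. (3) and (4) can be sketched for a network with a single hidden layer as follows. This is a minimal illustration rather than the source's implementation: it assumes a logistic sigmoid for f (so that ∂f/∂net = y(1 − y)), omits bias terms, and the function names, weight layout and learning rate are hypothetical choices.

```python
import math

def sigmoid(net):
    """Logistic sigmoid, assumed here as the activation function f."""
    return 1.0 / (1.0 + math.exp(-net))

def forward(x, w_hidden, w_out):
    """Forward pass: each neuron applies f to the weighted sum of its inputs.

    w_hidden[j][i] is the weight from input i to hidden neuron j;
    w_out[q][j] is the weight from hidden neuron j to output neuron q.
    """
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    out = [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w_out]
    return hidden, out

def backprop_step(x, target, w_hidden, w_out, eta=0.5):
    """One pattern-based update; returns the new weight matrices."""
    hidden, out = forward(x, w_hidden, w_out)
    # Eq. (3): output deltas, with df/dnet = y * (1 - y) for the sigmoid.
    delta_out = [y * (1 - y) * (t - y) for y, t in zip(out, target)]
    # Eq. (4): hidden deltas use the weighted sum of the downstream deltas.
    delta_hidden = [
        h * (1 - h) * sum(w_out[q][j] * delta_out[q] for q in range(len(w_out)))
        for j, h in enumerate(hidden)
    ]
    # Weight change: eta * delta of the receiving neuron * signal on the connection.
    new_w_out = [[w + eta * delta_out[q] * hidden[j] for j, w in enumerate(row)]
                 for q, row in enumerate(w_out)]
    new_w_hidden = [[w + eta * delta_hidden[j] * x[i] for i, w in enumerate(row)]
                    for j, row in enumerate(w_hidden)]
    return new_w_hidden, new_w_out
```

Calling `backprop_step` repeatedly on the same pattern should move the network output toward the target, which is a quick sanity check for the sign conventions.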

Beginning with the output layer, the δ term is computed for neurons in all layers and weight updates are determined for all connections, iteratively. The weight updating process can happen after the presentation of each training pattern (pattern-based training) or after

Engineering Education and Research Using MATLAB


the presentation of the whole set of training patterns (batch training). A training epoch is completed when all training patterns have been presented once to the MLP.

A commonly adopted method to speed up the training is to add a "momentum" term to the weight-update rule, which effectively lets the previous weight change influence the new weight change:

Δwji(I + 1) = η δj xi + μ Δwji(I) (5)

where Δwji(I + 1) and Δwji(I) are the weight changes in epochs (I + 1) and (I), respectively, and μ is the "momentum" coefficient (Jayawardena & Fernando, 1998).
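The effect of Eq. (5) can be seen in a small numerical sketch. The values of η, δj, xi and μ below are illustrative assumptions, and `momentum_update` is a hypothetical helper name; with a constant gradient term, the weight change grows from one epoch to the next because part of the previous change is carried over.

```python
def momentum_update(prev_change, eta, delta_j, x_i, mu):
    """Eq. (5): new weight change = gradient term + mu * previous change."""
    return eta * delta_j * x_i + mu * prev_change

# Apply the rule for three epochs with a constant gradient term.
change = 0.0
history = []
for _ in range(3):
    change = momentum_update(change, eta=0.1, delta_j=0.5, x_i=1.0, mu=0.9)
    history.append(change)
```

Setting `mu=0` recovers the plain update, in which every epoch would produce the same change.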



ANNs are learning systems that have solved a large number of complex problems in different areas (classification, clustering, regression, etc.) [1]. The interesting characteristics of this powerful technique have led researchers to use it in different environments [2].

Nevertheless, the use of ANNs has some problems, mainly related to their development process. This process can be divided into two parts: architecture development, and training and validation. As the network architecture is problem-dependent, its design used to be performed manually: the expert had to design different architectures and train them until finding the one that achieved the best results after training. The manual nature of this process makes it slow, although the recent use of ANN development techniques has contributed to a more automatic procedure.

As a general rule, the field of ANN generation using evolutionary algorithms is divided into three main areas: evolution of weights, of architectures and of learning rules. First, weight evolution starts from an ANN with an already determined topology. In this case, the problem to be solved is the training of the connection weights, attempting to minimize the network error.

(Artificial Neural Network Development by

means of Genetic Programming with Graph



The development of Artificial Neural Networks (ANNs) is usually a slow process in which the human expert has to test several architectures until one is found that achieves the best results for a given problem.

(Artificial Neural Network Development by means of Genetic Programming with Graph





The parametric representations describe the network as a group of parameters such as the number of hidden layers, the number of nodes in each layer, the number of connections between two layers, etc. [13]. Although the parametric representation can reduce the length of the chromosome, the evolutionary algorithm performs the search within a restricted area of the search space containing all the possible architectures. Another type of non-direct representation is based on grammatical rules [11]. In this system, the network is represented by a group of rules, in the form of production rules, which develop a matrix that represents the network; this approach has several restrictions.




5.1.1 Artificial Neural Network

The ANNs were designed with a multiple layer topology K1 × K2 × · · · × KM, where Ki represents the number of cells in layer i, as shown in Figure 5.1(a). Each ANN neuron in layer i, NEURON{i,j} (i ∈ [2;M] and j ∈ [1;Ki]), receives Ki−1 inputs, corresponding to all the outputs of layer i − 1, and performs the computation shown in Figure 5.1(b). Firstly, it weights all the inputs by multiplying them with a constant coefficient Ci,j. Afterwards, it computes the sum of all products and applies a sigmoid function, f(·), to the sum. The description of all generated ANNs considers an optimised DFG using basic components. In this sense, there are only three types of operations: 2-value multiplications, 2-value additions and sigmoids. For performing the sum of products, a tree of adders is built so as to minimise the DFG critical path. The sigmoids are considered to be computationally expensive operations and thus are implemented by means of look-up tables. The ANN has one input, x[n], and KM outputs, y1[n] · · · yKM[n]. However, it is considered to have time information: its output value depends on the latest L inputs. Thus, the inputs to the first layer are in fact:

x[n], · · · , x[n − L] (5.1)

The ANN benchmarks do not include any of the typical learning structures, i.e. the computational blocks which serve to adapt the ANN weights C{i,j}. It is assumed that the intention is to implement the ANN after the learning stage.

Figure 5.1: The structure of the Artificial Neural Networks.
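The neuron computation described in this section (products, a tree of 2-value adders, then a sigmoid read from a look-up table) can be sketched in software as follows. This is only an illustration under stated assumptions: the table range [-8, 8], its 257 entries and the nearest-entry look-up are hypothetical choices, not values taken from the source.

```python
import math

def tree_sum(values):
    """Sum values pairwise, level by level, as a balanced tree of 2-value adders."""
    level = list(values)
    if not level:
        return 0.0
    while len(level) > 1:
        nxt = [level[k] + level[k + 1] for k in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])  # an odd element passes through to the next level
        level = nxt
    return level[0]

# Hypothetical look-up table: 257 samples of the sigmoid on [-8, 8].
LUT_MIN, LUT_MAX, LUT_SIZE = -8.0, 8.0, 257
LUT_STEP = (LUT_MAX - LUT_MIN) / (LUT_SIZE - 1)
SIGMOID_LUT = [1.0 / (1.0 + math.exp(-(LUT_MIN + k * LUT_STEP)))
               for k in range(LUT_SIZE)]

def sigmoid_lut(s):
    """Nearest-entry table look-up, saturating outside the table range."""
    s = min(max(s, LUT_MIN), LUT_MAX)
    return SIGMOID_LUT[round((s - LUT_MIN) / LUT_STEP)]

def neuron(inputs, coeffs):
    """Weight the inputs, add the products with the adder tree, apply the sigmoid."""
    products = [c * x for c, x in zip(coeffs, inputs)]
    return sigmoid_lut(tree_sum(products))
```

The pairwise reduction needs about log2(n) adder levels for n products instead of n − 1 sequential additions, which is the reason a tree shortens the critical path.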


The nodes used to build ANNs with this system are the following:


• ANN. Node that defines the network. It appears only at the root of the tree. It has as many descendants as the network has expected outputs, each of them a neuron.

• n-Neuron. Node that identifies a neuron with n inputs. This node has 2*n descendants. The first n descendants are other neurons, either input or hidden ones. The second n descendants are arithmetical sub-trees. These sub-trees represent real values, which correspond to the connection weights of the input neurons - the first descendants - of this neuron.

• n-Input neuron. Nodes that define an input neuron which receives its activation value from the input variable n. These nodes do not have any children.

• Finally, the arithmetic operator set {+,-,*,%}, where % designates the operation of protected division (returns 1 as a result if the divisor is 0). These operators generate the values of connection weights (sub-trees of the n-Neuron nodes). These nodes perform operations among constants in order to obtain new values. As real values are also needed for such operations, they have to be introduced by adding random constants in the range [-4, 4] to the terminal set.

(Artificial Neural Network Development by means of Genetic Programming with Graph
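The node types listed above can be illustrated with a small sketch that evaluates the arithmetic sub-trees of an n-Neuron node into connection weights. The tuple layout used to encode nodes here is a hypothetical choice made for illustration; the source does not prescribe a concrete data structure.

```python
def protected_div(a, b):
    """Protected division: returns 1 when the divisor is 0."""
    return 1.0 if b == 0 else a / b

# The operator set {+, -, *, %}, with % as protected division.
OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "%": protected_div,
}

def eval_weight(tree):
    """Evaluate an arithmetic sub-tree; leaves are real constants."""
    if isinstance(tree, (int, float)):
        return float(tree)
    op, left, right = tree
    return OPS[op](eval_weight(left), eval_weight(right))

def decode_neuron(node):
    """Split an n-Neuron node into its n children and their n weights.

    Assumed layout: ("neuron", [child_1..child_n], [weight_tree_1..weight_tree_n]),
    where a child is ("input", k) or another neuron node.
    """
    _, children, weight_trees = node
    return children, [eval_weight(t) for t in weight_trees]

# A 2-Neuron fed by input variables 1 and 2.
example = ("neuron",
           [("input", 1), ("input", 2)],
           [("+", 1.5, ("*", 2.0, -0.5)), ("%", 3.0, 0.0)])
children, weights = decode_neuron(example)
```

The second weight tree divides by zero on purpose, so protected division yields 1 instead of raising an error; this is what keeps every randomly generated tree evaluable.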


The generalization problem

The topology of a network, that is, the number of nodes and the location and number of connections among them, has a significant impact on the performance of the network and its ability to generalize. The connection density in a neural network determines its ability to store information. If a network does not have enough connections among nodes, the training algorithm may never converge; the neural network will not be able to approximate the function. On the other hand, overfitting can happen in a densely connected network. Overfitting is a problem of statistical models that have too many parameters. This is undesirable because, instead of learning how to approximate the function present in the data, the network could simply memorize every training example. The noise in the training data is then memorized as part of the function, often destroying the ability of the network to generalize.


With good generalization as the goal, it is very difficult to identify the best moment to stop training by looking only at the training learning curve. In particular, as mentioned previously, the network may end up overfitting the training data if the training session is not stopped at the right moment.


We can identify the onset of overfitting by using cross-validation: the training examples are split into a training subset and a validation subset. The training subset is used to train the network in the usual way, with one modification: the training session is stopped periodically (after a fixed number of epochs), and the network is evaluated on the validation subset after each training period.

Figure 2 shows the conceptualized forms of two learning curves, one measured on the training subset and the other on the validation subset. Usually, the model does not perform as well on the validation subset as it does on the training subset, on which the design of the model was based. The estimation (training) learning curve decreases monotonically to a minimum as the number of epochs grows, in the usual way. In contrast, the validation learning curve decreases to a minimum and then begins to increase as training continues.

When we look at the estimation (training) learning curve, it seems that we could improve by training beyond the minimum point on the validation learning curve. In fact, what the network learns beyond that point is essentially noise contained in the training set. The early stopping heuristic suggests that the minimum point on the validation learning curve should be used as the criterion for stopping the training session.

Figure 2 Representation of the early stopping heuristic based on cross-validation.

The question that arises here is how many training epochs we should allow without improvement on the validation subset before stopping the training session. We define an early-stopping parameter b to represent this number of training epochs.
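One way to read the early-stopping parameter b is as a patience counter: training stops once b consecutive validation evaluations fail to improve on the best error seen so far. The sketch below follows that interpretation, which is an assumption, and treats each entry of the error list as one evaluation; the function name and the sample curve are illustrative.

```python
def early_stopping_epoch(validation_errors, b):
    """Return (stop_epoch, best_epoch) for a sequence of validation errors.

    Training stops once b consecutive evaluations fail to improve on the
    best validation error seen so far. Epochs are counted from 1.
    """
    best_error = float("inf")
    best_epoch = 0
    waited = 0
    for epoch, err in enumerate(validation_errors, start=1):
        if err < best_error:
            best_error, best_epoch, waited = err, epoch, 0
        else:
            waited += 1
            if waited >= b:
                return epoch, best_epoch
    # The curve ended before the patience ran out.
    return len(validation_errors), best_epoch

# Validation error falls to a minimum at epoch 4, then rises again.
curve = [0.90, 0.70, 0.55, 0.50, 0.52, 0.51, 0.56, 0.60]
stop, best = early_stopping_epoch(curve, b=3)
```

In practice the weights from `best_epoch` would be kept, since they correspond to the minimum of the validation learning curve.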

Automatic Generation of Neural Networks

based on Genetic Algorithms


This work focuses on the development of methods for the evolutionary design of architectures for artificial neural networks. Neural networks are usually seen as a method to implement complex non-linear mappings (functions) using simple elementary units interrelated through connections with adaptive weights [10, 11]. We focus on optimizing the connectivity structure of these networks.

(Automatic Generation of Neural Networks

based on Genetic Algorithms)


Using MATLAB to Develop Artificial

Neural Network Models for Predicting

Global Solar Radiation in Al Ain City - UAE

4. Designing and programming ANN models

4.1 Designing ANN models

Designing ANN models follows a number of systematic procedures. In general, there are five basic steps: (1) collecting data, (2) preprocessing data, (3) building the network, (4) training the network, and (5) testing the performance of the model, as shown in Fig. 6.

Fig. 6. Basic flow for designing artificial neural network model

4.1.1 Data collection

Collecting and preparing sample data is the first step in designing ANN models. As outlined in section 3, measurement data of maximum temperature (°C), mean wind speed (knots), sunshine (hours), mean relative humidity (%) and solar radiation (kWh/m²) for Al Ain city for the 13-year period from 1995 to 2007 were collected through the NCMS.

4.1.2 Data pre-processing

After data collection, three data preprocessing procedures are carried out so that the ANNs can be trained more efficiently: (1) handling missing data, (2) normalizing the data and (3) randomizing the data. Missing data are replaced by the average of neighboring values during the same week. Normalizing the input data before presenting them to the network is generally good practice, since mixing variables with large magnitudes and small magnitudes will confuse the learning algorithm about the importance of each variable and may eventually force it to reject the variable with the smaller magnitude (Tymvios et al.,
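The three preprocessing steps can be sketched as follows. The source does not state which normalization was used, so min-max scaling to [0, 1] is an assumption here, as are the helper names, the use of None to mark a missing reading, and the sample temperatures. The sketch also assumes each week contains at least one known value.

```python
import random

def fill_missing(values):
    """Replace None with the mean of the nearest known neighbors on each side."""
    filled = list(values)
    for k, v in enumerate(values):
        if v is None:
            left = next((values[j] for j in range(k - 1, -1, -1)
                         if values[j] is not None), None)
            right = next((values[j] for j in range(k + 1, len(values))
                          if values[j] is not None), None)
            known = [n for n in (left, right) if n is not None]
            filled[k] = sum(known) / len(known)
    return filled

def min_max_normalize(values):
    """Scale values linearly to [0, 1] (one common normalization choice)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def shuffle_rows(rows, seed=0):
    """Randomize the order of the samples reproducibly."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    return shuffled

# One week of maximum temperatures with three missing readings.
week_temps = [38.0, None, 40.0, 41.0, None, None, 39.0]
clean = fill_missing(week_temps)
scaled = min_max_normalize(clean)
```

Each input variable would be normalized separately, so that temperature, wind speed and humidity end up on comparable scales before training.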


4.1.3 Building the network

At this stage, the designer specifies the number of hidden layers, the number of neurons in each layer, the transfer function in each layer, the training function, the weight/bias learning function, and the performance function. In this work, multilayer perceptron (MLP) and radial basis function (RBF) networks are used.

4.1.4 Training the network

During the training process, the weights are adjusted in order to make the actual (predicted) outputs close to the target (measured) outputs of the network. In this study, data from the 10-year period from 1995 to 2004 are used for training. As outlined in section 3, fourteen different types of training algorithms are investigated for developing the MLP network. MATLAB provides built-in transfer functions, three of which are used in this study: linear (purelin), hyperbolic tangent sigmoid (tansig) and logistic sigmoid (logsig). The graphical illustration and mathematical form of these functions are shown in Table 2.
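For reference, the mathematical forms of the three transfer functions can be written out in a few lines. These are plain Python re-implementations for illustration, not calls into MATLAB; the function names simply mirror the MATLAB ones.

```python
import math

def purelin(n):
    """Linear transfer function: a = n."""
    return n

def logsig(n):
    """Logistic sigmoid: a = 1 / (1 + exp(-n)), output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-n))

def tansig(n):
    """Hyperbolic tangent sigmoid: a = 2 / (1 + exp(-2n)) - 1, output in (-1, 1)."""
    return 2.0 / (1.0 + math.exp(-2.0 * n)) - 1.0
```

The `tansig` expression is mathematically equivalent to tanh(n); the main practical difference from `logsig` is the output range, which matters when choosing how to normalize the targets.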

4.1.5 Testing the network

The next step is to test the performance of the developed model. At this stage, unseen data are presented to the model. For the case study of Al Ain city, weather data between 2005 and 2007 have been used for testing the ANN models.

In order to evaluate the performance of the developed ANN models quantitatively and to verify whether there is any underlying trend in their performance, a statistical analysis involving the coefficient of determination (R²), the root mean square error (RMSE) and the mean bias error (MBE) was conducted. RMSE provides information on the short-term performance and is a measure of the variation of the predicted values around the measured data. The lower the RMSE, the more accurate the estimation. MBE indicates the average deviation of the predicted values from the corresponding measured data and can provide information on the long-term performance of the models; the smaller the magnitude of the MBE, the better the long-term model prediction. A positive MBE value indicates the amount of overestimation in the predicted GSR, and a negative value the amount of underestimation. The expressions for the aforementioned statistical parameters are:

where Ip,i denotes the predicted GSR on a horizontal surface in kWh/m², Ii denotes the measured GSR on a horizontal surface in kWh/m², and n denotes the number of observations.
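The three statistics can be computed as in the sketch below, which assumes their commonly used definitions: MBE as the mean of (predicted − measured), RMSE as the square root of the mean squared difference, and R² as one minus the ratio of the residual sum of squares to the total sum of squares. The exact form of R² used in the study may differ, and the sample values are made up for illustration.

```python
import math

def mbe(predicted, measured):
    """Mean bias error: positive values indicate overestimation."""
    return sum(p - m for p, m in zip(predicted, measured)) / len(measured)

def rmse(predicted, measured):
    """Root mean square error of the predictions."""
    squared = sum((p - m) ** 2 for p, m in zip(predicted, measured))
    return math.sqrt(squared / len(measured))

def r_squared(predicted, measured):
    """Coefficient of determination, assumed here as 1 - SS_res / SS_tot."""
    mean_m = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for p, m in zip(predicted, measured))
    ss_tot = sum((m - mean_m) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot

# Illustrative GSR values in kWh/m2; every prediction is 0.5 too high.
measured = [5.0, 6.0, 7.0, 8.0]
predicted = [5.5, 6.5, 7.5, 8.5]
```

With a constant offset of 0.5, MBE and RMSE coincide; in general RMSE is at least as large as the magnitude of MBE, because positive and negative errors cancel in MBE but not in RMSE.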

4.2 Programming the neural network model

MATLAB is a numerical computing environment and also a programming language. It allows easy matrix manipulation, plotting of functions and data, implementation of algorithms, creation of user interfaces and interfacing with programs in other languages. The Neural Network Toolbox contains the MATLAB tools for designing, implementing, visualizing and simulating neural networks. It also provides comprehensive support for many proven network paradigms, as well as graphical user interfaces (GUIs) that enable the user to design and manage neural networks in a very simple way.



Table 2. MATLAB built-in transfer functions