Structural Design For An Artificial Neural Network Computer Science Essay

The structural design of Artificial Neural Networks (ANNs) greatly influences the success of the training process. A network that is too small cannot learn the problem effectively, whereas a network that is too large will overfit and generalize poorly. When the aim is a simple ANN architecture with good generalization performance, designing the network structure becomes a challenging task. Designing such a network manually is not easy, since low classifier complexity, good generalization and high performance are conflicting goals.

Radial Basis Function (RBF) networks are typical ANNs. They were introduced into the neural network literature by Broomhead and Lowe (1988), motivated by the locally tuned response observed in biological neurons. RBF networks have a number of advantages over other types of ANNs, including better approximation capabilities, simpler network structures and faster learning algorithms, and they have been widely applied in many science and engineering fields. A key advantage of RBF networks from the perspective of practitioners is the clear and understandable interpretation of the functionality of the basis functions. RBF networks also combine a number of concepts from approximation theory, clustering and neural network theory (Haykin, 1994; Poggio and Girosi, 1989; Bishop, 1995). It is even possible to extract rules from RBF networks for deployment in an expert system. Most ANN designs aim for high classification accuracy and low structural complexity. Simultaneous optimization of accuracy and complexity is known to improve generalization and to help avoid overtraining. Therefore, choosing an appropriate optimization algorithm for an ANN such as an RBF network is crucial in real applications that involve many solutions or objectives.

However, many real-world optimization problems involve several conflicting objectives. Instead of a single optimal solution, such problems have a set of optimal solutions, called the Pareto optimal set. Each Pareto optimal solution represents a different trade-off between the objectives, and in the absence of preference information none of them can be said to be better than the others. Pareto optimal solutions are used to evolve ANNs that are optimal with respect to both classification accuracy and structural complexity.
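The notion of a Pareto optimal set can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: both objectives are minimized, and the candidate values (classification error, number of hidden nodes) are hypothetical numbers, not results from this study.

```python
from typing import List, Sequence, Tuple

Objectives = Tuple[float, ...]  # every objective is assumed to be minimized


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b in every objective and strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points: List[Objectives]) -> List[Objectives]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical candidate networks: (classification error, number of hidden nodes).
candidates = [(0.10, 12), (0.08, 20), (0.15, 5), (0.12, 15), (0.08, 25)]
front = pareto_front(candidates)
print(front)  # the three mutually non-dominated trade-offs
```

Here (0.12, 15) is dropped because (0.10, 12) is better in both objectives, while the three remaining candidates each trade error against size and cannot be ranked without preference information.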

The Pareto-based approach to machine learning has several advantages over traditional learning algorithms. First, the performance of learning algorithms can be improved, probably because of the new error surface introduced by multi-objective optimization (Abbas, 2003). Second, it is possible to generate simultaneously multiple learning models that account for different learning goals, e.g., accuracy and complexity (Igel, 2005; Jin et al., 2004), multiple error measures (Fieldsend and Singh, 2005), or interpretability and accuracy (Jin et al., 2005). Third, the multiple learning models produced by multi-objective optimization are well suited for constructing learning ensembles (Abbas, 2003; Chandra and Yao, 2004; Jin et al., 2004). Finally, more information can be gained by analyzing the Pareto front obtained in multi-objective machine learning.

A multi-objective optimization algorithm is applied to the learning problem to improve generalization from the training data to unseen data. The algorithm aims at finding a set of solutions, called the Pareto optimal set, from which the best one is selected. Evolutionary algorithms (EAs) are widely used for architecture optimization of RBF networks. However, the existing approaches suffer from high run-time and convergence to local minima. EAs are nevertheless good candidates for multi-objective optimization problems because they can search for multiple Pareto optimal solutions simultaneously and perform a better global search of the search space. EAs are population-based algorithms, which allows simultaneous exploration of different parts of the Pareto front.

In this thesis, RBF network learning approaches based on MOPSO, MOGA and MODE are presented. The background of the problem, the study objectives, the importance of the study and the scope of the research are concisely presented in this chapter.

1.2 Background of the Problem

In RBF networks, different layers perform different tasks. Therefore, it is useful to separate the optimization of the hidden layer and the output layer of the network by using different techniques. The parameters of an RBF network are the centers, the influence fields (widths) of the radial functions and the output weights. A two-step learning strategy is therefore adopted to train them separately. The first step, unsupervised learning, determines the centers and widths of the RBF network (the structure identification stage) using algorithms such as k-means clustering and nearest-neighbor algorithms. The second step, supervised learning, determines the weights between the hidden layer and the output layer (the parameter estimation stage) using algorithms such as the least mean squares algorithm and gradient-based methods. These approaches are time consuming, since they require the evaluation of many different structures through a trial-and-error procedure.
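The two-step strategy described above can be sketched in plain Python. This is a minimal illustration, not the implementation used in this thesis: it assumes Gaussian basis functions, a nearest-center heuristic for the widths, a fixed number of hidden nodes (k = 2) and a toy two-cluster data set, all of which are assumptions made for the example.

```python
import math
import random


def kmeans(points, k, iters=20, seed=0):
    """Step 1a (unsupervised): place k centers with plain k-means."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: math.dist(p, centers[j]))
            clusters[nearest].append(p)
        # Move each center to its cluster mean; keep it in place if the cluster is empty.
        centers = [
            tuple(sum(axis) / len(cl) for axis in zip(*cl)) if cl else centers[j]
            for j, cl in enumerate(clusters)
        ]
    return centers


def nearest_neighbor_widths(centers):
    """Step 1b: set each width to the distance to the closest other center."""
    return [
        max(min(math.dist(c, o) for j, o in enumerate(centers) if j != i), 1e-6)
        for i, c in enumerate(centers)
    ]


def hidden_layer(x, centers, widths):
    """Gaussian activations, plus a constant 1.0 that feeds the bias weight."""
    acts = [math.exp(-math.dist(x, c) ** 2 / (2.0 * s ** 2))
            for c, s in zip(centers, widths)]
    return acts + [1.0]


def train_lms(inputs, targets, centers, widths, lr=0.3, epochs=200):
    """Step 2 (supervised): least mean squares updates of the output weights."""
    weights = [0.0] * (len(centers) + 1)
    for _ in range(epochs):
        for x, t in zip(inputs, targets):
            phi = hidden_layer(x, centers, widths)
            error = t - sum(w * a for w, a in zip(weights, phi))
            weights = [w + lr * error * a for w, a in zip(weights, phi)]
    return weights


def predict(x, centers, widths, weights):
    return sum(w * a for w, a in zip(weights, hidden_layer(x, centers, widths)))


# Toy data (hypothetical): class 0 near the origin, class 1 near (3, 3).
inputs = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (-0.1, 0.2),
          (3.0, 3.0), (3.2, 2.9), (2.9, 3.1), (3.1, 3.3)]
targets = [0.0] * 4 + [1.0] * 4

centers = kmeans(inputs, k=2)
widths = nearest_neighbor_widths(centers)
weights = train_lms(inputs, targets, centers, widths)
outputs = [predict(x, centers, widths, weights) for x in inputs]
mse = sum((o - t) ** 2 for o, t in zip(outputs, targets)) / len(inputs)
print(f"training MSE: {mse:.4f}")
```

Note that k is fixed by hand in this sketch. Choosing it well would mean repeating both steps for many candidate structures, which is the trial-and-error cost mentioned above.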

It is desirable to combine structure identification and parameter estimation into a single optimization problem. However, this problem cannot be solved easily by standard optimization methods. EAs and Swarm Intelligence (SI) strategies offer an interesting alternative for solving this complicated problem. Genetic algorithms (GAs), a typical representative, have been used to select the optimal structure of RBF networks. However, GAs have some shortcomings, such as a larger number of predefined parameters and a heavier programming burden (Ding et al., 2005).

There are many studies in the literature on RBF network learning. Venkatesan et al. (2006) used an RBF network for pattern recognition and classification in the diagnosis of diabetes mellitus, and compared the results with an MLP network and logistic regression. Their results indicated that the RBF network performed better than the other models. Zhang et al. (2004) applied an RBF network trained with a growing and pruning algorithm, called the Growing and Pruning Radial Basis Function Network (GAP-RBFN), to two real problems in the biomedical domain, a breast cancer problem and a gene problem. The results showed that GAP-RBFN can achieve better or at least similar generalization performance with a much more compact structure and a higher training speed compared with other ANN methods. Huang et al. (2003) applied RBF networks to time series forecasting. They used a divide-and-conquer learning approach for RBF networks (DCRBF), a hybrid system consisting of several sub-RBF networks. The results showed that the proposed approach had a faster learning speed with slightly better generalization ability.

Among the meta-heuristic techniques, Particle Swarm Optimization (PSO), Genetic Algorithms (GAs) and Differential Evolution (DE) were until recently applied only to single-objective optimization tasks. However, the high convergence speed of the PSO algorithm has attracted researchers to develop multi-objective optimization algorithms based on PSO (Kennedy and Eberhart, 2001).

To apply the PSO, GA and DE strategies to multi-objective optimization problems, these algorithms need to be modified. In Multi-Objective Optimization (MOO), the solution of a problem with multiple objectives does not consist of a single solution (as in global optimization). Instead, MOO aims at finding a set of different solutions (the so-called Pareto optimal set). In general, when solving a multi-objective problem, there are three main goals (Zitzler et al., 2000):

Maximize the number of elements of the Pareto optimal set found.

Minimize the distance between the Pareto front produced by the MOO algorithm and the true (global) Pareto front (assuming its location is known).

Maximize the spread of the solutions found, so that the distribution of vectors is as smooth and uniform as possible.
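These three goals are commonly turned into numerical indicators. The Python sketch below is one possible illustration and is not taken from the cited work: it counts the solutions found, uses a simplified mean-distance variant of generational distance for convergence, and uses a Schott-style spacing value (based on Manhattan distance) for uniformity. The function names and the sample fronts are hypothetical.

```python
import math


def generational_distance(front, true_front):
    """Goal 2: mean Euclidean distance from each found point to the nearest true point."""
    return sum(min(math.dist(p, t) for t in true_front) for p in front) / len(front)


def spacing(front):
    """Goal 3: standard deviation of nearest-neighbor Manhattan distances.

    A value of 0 means the solutions are evenly spaced along the front.
    Requires at least two points.
    """
    nearest = [
        min(sum(abs(a - b) for a, b in zip(p, q))
            for j, q in enumerate(front) if j != i)
        for i, p in enumerate(front)
    ]
    mean = sum(nearest) / len(nearest)
    return math.sqrt(sum((d - mean) ** 2 for d in nearest) / (len(nearest) - 1))


# Hypothetical two-objective problem whose true front lies on f2 = 1 - f1.
true_front = [(0.0, 1.0), (0.25, 0.75), (0.5, 0.5), (0.75, 0.25), (1.0, 0.0)]
found = [(0.0, 1.0), (0.5, 0.5), (1.0, 0.0)]
shifted = [(f1, f2 + 0.1) for f1, f2 in found]  # same shape, 0.1 away from the front

print(len(found))                                  # goal 1: number of solutions found
print(generational_distance(found, true_front))    # on the true front
print(generational_distance(shifted, true_front))  # about 0.1
print(spacing(found))                              # evenly spread
```

A lower generational distance and a lower spacing value are better, while a larger number of non-dominated solutions is better.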

Recently, many MOO methods have been proposed in the literature for RBF network learning. Kokshenev et al. (2008) applied a multi-objective (MOBJ) optimization algorithm to the problem of inductive supervised learning, based on a smoothness-based measure of apparent (effective) complexity for RBF networks. However, the computational complexity of the proposed algorithm is high in comparison with other state-of-the-art machine learning methods. Kondo et al. (2007) proposed a multi-objective evolutionary algorithm for obtaining a Pareto optimal set of RBF networks. RBF networks are widely used as a model structure for nonlinear systems, where determining the structure (that is, the number of basis functions) involves a trade-off between model complexity and accuracy.

A multi-objective genetic algorithm-based design procedure for RBF networks was proposed by Yen (2006). A Hierarchical Rank Density Genetic Algorithm (HRDGA) was developed to evolve the neural network's topology and its parameters simultaneously. Hatanaka et al. (2006) suggested an evolutionary multi-objective optimization approach to RBF network structure determination and applied it to nonlinear system identification. The candidate RBF network structures are encoded into the chromosomes of a GA and evolve toward the Pareto optimal front defined by several objective functions covering model accuracy and complexity. An ensemble of networks is then constructed from the Pareto optimal networks. Numerical simulation results indicate that, in the presence of outliers or a lack of data, the ensemble network is much more robust than a single network selected by information criteria.

Kondo et al. (2006) proposed an RBF network ensemble constructed from the Pareto optimal set obtained by multi-objective evolutionary computation. The Pareto optimal set of RBF networks was based on three criteria: model complexity, representation ability and model smoothness. The method was applied to a pattern classification problem. Experiments on a benchmark problem showed that the proposed method has generalization ability comparable to that of conventional ensemble methods. In another study, Lefort et al. (2006) applied the RBF-Gene algorithm to optimize RBF networks. Unlike other works, this algorithm can evolve both the structure and the numerical parameters of the network, that is, the number of neurons and their weights. Gonzalez et al. (2001) framed the problem of optimizing an RBF network from training examples as a multi-objective problem and proposed an evolutionary algorithm to solve it. This algorithm incorporates mutation operators to guide the search towards good solutions. The results showed that the proposed method found very good networks for the prediction of the Mackey-Glass time series.

The use of Multi-Objective Evolutionary Algorithms (MOEAs), namely Multi-Objective PSO (MOPSO), Multi-Objective GA (MOGA) and Multi-Objective DE (MODE), in RBF network learning seems worthwhile for several reasons:

MOEAs use non-dominance and Pareto optimality theory to guide the search toward the true Pareto set of RBF networks (non-dominated solutions) and to generate a well-distributed Pareto front.

The algorithms are computationally simple and efficient, with fast convergence rates.

Many solutions in the Pareto optimal set can be obtained within a single simulation run.

1.3 Problem Statement

RBF networks have certain advantages over other types of ANNs, such as better approximation capabilities, simpler network structures and faster learning algorithms. However, constructing a quality RBF network with low generalization error and high classification accuracy can be a time-consuming process, as the modeller must select both a suitable set of inputs and a suitable RBF network structure. On the other hand, there are approaches in which the network structure and its parameters are estimated by evolutionary computation.

However, more work is still required to develop a new model of hybrid learning (unsupervised and supervised learning) of RBF networks with MOEAs, namely MOPSO, MOGA and MODE. The proposed model aims to balance the number of hidden nodes, the training error and the norm of the network weights so that overfitting is avoided.
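The three quantities to be balanced can be expressed as an objective vector that a multi-objective algorithm would minimize. The following Python sketch assumes Gaussian basis functions, a single output and the Euclidean norm of the output weights. The class and function names are hypothetical and are not taken from the thesis implementation.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class RBFNetwork:
    centers: List[Tuple[float, ...]]  # one center per hidden node
    widths: List[float]               # one Gaussian width per hidden node
    weights: List[float]              # one output weight per hidden node
    bias: float = 0.0

    def predict(self, x: Sequence[float]) -> float:
        """Weighted sum of Gaussian activations plus the bias."""
        return self.bias + sum(
            w * math.exp(-math.dist(x, c) ** 2 / (2.0 * s ** 2))
            for c, s, w in zip(self.centers, self.widths, self.weights)
        )


def objectives(net, inputs, targets):
    """Return (number of hidden nodes, training MSE, weight norm), all to be minimized."""
    mse = sum((net.predict(x) - t) ** 2 for x, t in zip(inputs, targets)) / len(inputs)
    weight_norm = math.sqrt(sum(w * w for w in net.weights))
    return (len(net.centers), mse, weight_norm)


# Hypothetical two-node network evaluated on a single training pattern.
net = RBFNetwork(centers=[(0.0,), (10.0,)], widths=[1.0, 1.0], weights=[3.0, 4.0])
nodes, mse, norm = objectives(net, inputs=[(0.0,)], targets=[3.0])
print(nodes, mse, norm)
```

Each candidate network evaluated in this way yields one point in objective space, and non-dominated points form the Pareto front from which a final network can be chosen.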

1.4 The Research Question

The main research question is:

Are MOEAs, namely MOPSO, MOGA and MODE, beneficial for evolving RBF network learning?

Thus, the following issues need to be addressed in order to answer the main research question stated above:

Are MOEAs capable of optimizing RBF network complexity (number of hidden nodes and/or norm of weights) as well as the error function (Goh et al., 2008)?

Would the classification accuracy increase when the proposed hybrid RBFN-MOPSO, RBFN-MOGA or RBFN-MODE is implemented?

Could MOEAs improve the generalization error of the RBF network on unseen data and produce better performance in classification problems?

1.5 Objectives of the Research

To answer the above questions, the objectives of this study have been identified as:

To develop RBF network learning with MOPSO, MOGA, and MODE.

To optimize the structure of the RBF network (number of hidden nodes and/or norm of weights) and the error function so that accuracy and complexity are balanced and the network generalizes well.

To validate the efficiency of the proposed methods.

To compare the results of RBFN-MOPSO, RBFN-MOGA and RBFN-MODE with those of previous related studies.

1.6 Scope of the Study

To achieve the above objectives, the scope of this study is limited to the following:

Eight data sets on binary and multi-class classification problems (breast cancer, Pima Indians diabetes, heart, hepatitis, liver, iris, wine and yeast), selected from a machine learning benchmark repository, will be used in this study.

Focus will be on Multi-Objective Optimization and the proposed methods include MOPSO, MOGA and MODE for RBF network training, testing and validation in classification problems.

The comparison criteria are convergence towards the Pareto front, diversity, network structure (number of hidden nodes), sensitivity, specificity, correct classification accuracy and area under the receiver operating characteristic (ROC) curve.

In this study, RBF network complexity has been defined as

Number of hidden nodes (RBFs) in hidden layer.

Norm of connections (centers and weights)

And RBF network accuracy as

Mean Square Error (MSE)

Classification Accuracy

The programs are customized, developed and applied to RBF network using Microsoft Visual C++ 6.0 and Matlab 7.0.
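The classification criteria listed in this scope (sensitivity, specificity, correct classification accuracy and area under the ROC curve) can be sketched for the binary case as follows. The sketch is written in Python purely for illustration, rather than in the Visual C++ or Matlab used for the study. The labels, scores and the 0.5 decision threshold are hypothetical, and the area under the curve is computed by pairwise comparison of positive and negative scores.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, FN, TN, FP) for binary labels where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp, fn, tn, fp


def auc(y_true, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count as half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical network outputs for eight test patterns.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

tp, fn, tn, fp = confusion_counts(y_true, y_pred)
sensitivity = tp / (tp + fn)          # true positive rate
specificity = tn / (tn + fp)          # true negative rate
accuracy = (tp + tn) / len(y_true)    # correct classification rate
print(sensitivity, specificity, accuracy, auc(y_true, scores))
# prints: 0.75 0.75 0.75 0.9375
```

For the multi-class data sets, sensitivity and specificity would have to be computed per class (for example one class against the rest), which this binary sketch does not cover.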

1.7 Importance of the Study

The study investigates the capabilities of multi-objective evolutionary algorithms applied to RBF networks in pattern classification tasks. RBF networks trained with MOPSO, MOGA and MODE are evaluated and compared using various measures: number of hidden nodes, mean squared error, correct classification accuracy, sensitivity, specificity and area under the receiver operating characteristic (ROC) curve. The aim is to examine whether these newly proposed methods give better performance in terms of mean squared error and classification accuracy.

1.8 Thesis Outline

This thesis consists of five major parts, excluding the introductory chapter. The first two parts describe the background and the previously published work in the fields of RBF network learning and MOEAs, and the third part describes the research methodology for the work in this thesis. Finally, the last two parts present the algorithmic details and results of MOEA-based RBF network learning and conclude the thesis.

Chapter 2, Review on Multi-Objective Evolutionary Algorithms. This chapter begins with an overview of evolutionary algorithms for multi-objective optimization. Several studies from the literature on the modification of PSO, GA and DE for handling multi-objective optimization problems are presented, together with the main MOPSO, MOGA and MODE algorithms. The chapter concludes with a summary.

Chapter 3, Review on Radial Basis Function Network Design. Hybrid learning algorithms for RBF networks are detailed in this chapter. A broad overview of the basic concepts and traditional techniques of multi-objective optimization is given. Furthermore, the applicability of evolutionary multi-objective optimization to ANN learning, and especially to RBF network learning, is discussed. The chapter concludes with a summary.

Chapter 4, Research Methodology. This chapter describes the overall tools and techniques adopted in this research. It also gives a general picture of each phase of the work and presents the general research framework.

Chapter 5, Hybrid Learning of RBF Network Based on MOEAs. This chapter describes the MOPSO, MOGA and MODE algorithms for RBF network learning. It relates the different techniques by comparing the results produced by the different MOEA-based RBF network learning methods. The results are concisely reported and fully discussed in this chapter.

Chapter 6, Conclusion. This is the last chapter, which discusses and concludes the entire thesis. It highlights the contributions and findings of this work and provides suggestions and recommendations for future research.

1.9 Summary

This chapter serves as an introduction to the entire research work. The background of the problem, the study objectives, the research scope, the significance of the study to the body of knowledge and the organization of the thesis are all presented in this chapter. In particular, the chapter presents a broad overview of the problems involved in the hybrid learning of RBF networks, which is the subject of this thesis.