# Applications Of Hybrid Genetic Algorithms In Seismic Tomography Biology Essay

**Published:** **Last Edited:**

This essay has been submitted by a student. This is not an example of the work written by our professional essay writers.

In earth sciences essentially all inverse problems are nonlinear and involve a large number of unknown parameters. In practice, the application of analytical inversion or optimization techniques may be quite restrictive. Most analytical methods are local in nature and rely on a linearized form of the problem, adopting an iterative procedure using partial derivatives to improve an initial model. The final solution depends on the initial model and is hence prone to entrapment in local misfit minima. In contrast, global techniques that do not rely on partial derivatives are independent of the form of the data misfit criterion and are computationally robust. In such methods, randomly generated models are assessed in terms of their data-fitting quality and the process may be stopped after a certain number of acceptable models is identified or continued until a satisfactory data fit is achieved. A new class of methods known as genetic algorithms (GAs) achieves this approximation through model representation and manipulations. Only recently, the GAs attracted the attention of the earth sciences community with applications in geophysics and seismology.

In this paper, the efficiency of least-squares and genetic methods representing respectively the local and global optimization is presented and analyzed, in order to discuss in detail the efficiency and the limitations of each method in seismic tomography applications. The potential of both optimization methods has been tested and compared using the same experimental geometry, wavelength and geometrical characteristics of the model anomalies on a set of seismic refraction synthetic data modeling.

## 1. Introduction

Seismic traveltime tomography can be successfully used for the velocity model reconstruction across different geological scales varying from crust to upper mantle (Hole et al., 1992; Boschetti et al., 1996; Papazachos and Nolet, 1997; Bijwaard et al., 1998; Weber, 2000; Daley et al., 2004; Lersviriyanantakul et al., 2006; Ajo-Franklin et al., 2006; Vassallo and Zollo, 2008). Seismic tomography is often difficult to be applied due to irregular experimental geometries, signal quality (environmental and anthropogenic), as well as the lack of a priori information of the area under investigation. Moreover, in some geological environments, such as a sedimentary basin, is difficult to collect (picking) a lot of traveltimes due to high level of seismic attenuation, resulting in ray geometries with gaps and shadow areas.

During the tomographic reconstruction problem, a linear system of equations derived from an initial model is often solved. The system is usually ill-posed and inconsistent due to some sources of limited and noisy projection data and ray bending. The linear approach can be considered as an acceptable and computationally-effective solution only when some robust a priori information is available and a good starting model can thus be established. In practice, analytical methods, such as least-squares or conjugate gradient, are local in nature and rely on a linearized form of the problem in question, adopting an iterative procedure using partial derivatives to improve the aforementioned starting (initial) model. Moreover, this calculation of derivatives might be difficult and add further numerical instability because of the approximations adopted.

Since most of the geophysical optimization problems are nonlinear and result in irregular objective functions, the local optimization methods as mentioned above are prone to trapping in local minima, and their success depends heavily on the choice of the initial model. Several algorithms (global optimization methods) tackle this problem by sampling a wider search space in order to detect the global minimum (or maximum) of a given function (e.g. Sen and Stoffa, 1995; Parker, 1999; Sambridge and Mosegaard, 2002; Boschetti and Moresi, 2001). These global techniques do not rely on partial derivatives and are computationally robust by avoiding the Jacobian matrix inversion. Starting with a set of initial solutions, these algorithms progressively modify the solution set by mimicking the evolutionary behavior (genetic algorithms) of biological systems, until an acceptable result is achieved. Genetic algorithms (GAs) have been used to solve seismological (Wu Yih-Min et al., 2008; Nicknam et al., 2010; Nicknam and Eslamian, 2010), geophysical and environmental problems (Cavicchio, 1970; Davis, 1987,1991; Goldberg, 1989; Stoffa and Sen, 1991; Buckles and Petry, 1992; Jin and Madariaga, 1993; Mathias et al., 1993; Billings et al., 1994; Rodriguez-Zuniga et al., 1997; Louis et al., 1999; Sambridge, 1999a,b; Rucker and Ferre, 2005; Schwarzbach et al., 2005; Currenti et al., 2005; Krahenbuhl and Li, 2006; Basokur et al., 2007; Dal Moro et al., 2007; Jha et al., 2008; Dal Moro, 2008; Jishun et al., 2009; Akça and Basokur, 2010). In GAs, a series of "genetic operations" (namely selection, crossover and mutation) acts along various successive steps (generations) in order to improve the fitness of the population (which is a measure of goodness of fit between data and synthetics for the model) in successive generations. The final solution is a model that achieves the best fitness value. Due to their initially random and progressively more deterministic sampling of the function domain, these algorithms offer the possibility of efficiently locating/searching, the most "promising" areas of the solution space. They are able to solve nonlinear, nonlocal optimization problems, without the need of curvature information and consequently without the need for derivative calculations. Moreover, since genetic algorithms are based only on direct space sampling, any kind of linearization of the problem under investigation is avoided, with the consequent elimination of the errors involved in such an approximation. These factors make genetic algorithms particularly attractive for addressing complex real-world problems, and they are being used increasingly to address geophysical problems. However, the large dimensionality involved in most geophysical optimization problems can reduce the efficiency of genetic algorithm search, both in terms of quality of the result and computational cost.

In the present work, the application of hybrid genetic algorithms in seismic tomography is examined and the efficiency of least-squares and genetic methods as representative of the local and global optimization, respectively, is presented and evaluated. The robustness of both optimization methods has been tested and compared for the same source-receiver geometry and characteristics of the model structure (anomalies, etc.). A set of seismic refraction synthetic (noise free) data was used for modeling. Specifically, cross-well, down-hole and typical refraction studies using 24 geophones and 5 shots were used to confirm the potential applicability of the genetic algorithms in seismic traveltime tomography. In order to solve the forward modeling and estimate the traveltimes, revisited ray bending method was used supplemented by an approximate computation of the first Fresnel volume. The root mean square (RMS) error as the misfit function was used and calculated for the entire random velocity model for each generation. After the end of each generation and based on the misfit of the individuals (velocity models), the selection, crossover and mutation (typical process steps of genetic algorithms) operators take place continuing the evolution theory and coding the new generation. To optimize the computation time, since the whole procedure, the MATLAB Distributed Computing Engine (MDCE) was used in a multicore engine. During the tests, the initially fast convergence of the algorithm (typically first 5 generations) is followed by progressively slower improvements of the reconstructed velocity models. Therefore, to improve the final tomographic models, a hybrid genetic algorithm (GA) approach was adopted by combining the GAs with a local optimization method after several generations, on the basis of the convergence of the resulting models. This approach is shown to be efficient, as it directs the solution search towards a model region close to the global minimum.

## 2. Genetic algorithms

Genetic Algorithms (GAs) have been originally introduced by John Holland and his group at the University of Michigan in the 1970s (Holland, 1975). They belong to the class of stochastic search methods such as simulated annealing, ant colony optimization, and taboo search. The fundamental aspect characterizing a genetically based evolutional scheme is the Darwinist paradigm that the fittest survive and reproduce, while others disappear. Among the several appealing features that characterize GAs some are particularly relevant to solve optimization problems. The main advantage of this class of optimizers is that they tend to elude the attraction of local minima and their random-but-driven search schemes try to reach an optimal solution by considering all of the regions of a user-defined search space. In contrast to the common linear methods, they do not require an initial model to start the optimization. The user designs a search space within which the possible solutions are searched and evaluated. Moreover, as with other heuristic methods, GAs can be applied to problems where the function to be optimized exhibits discontinuities, which would exclude the use of derivative based approaches.

An initial population composed by an arbitrarily fixed number of individuals (candidate solutions or, to use the genetic lingo, chromosomes) is randomly generated. At early times, these chromosomes were formed by binary numbers only, but later they are generalized so as to be formed by any other entities such as arrays, lists, trees, mathematical operators, or any kind of number depending on the type of problem. In a problem such as the one considered in this work, we chose the initial values of the physical parameters (velocities) that determine the observed geophysical response that we have measured. Each of these physical parameters becomes a model parameter (chromosome as the set of these numbers such as, x1, x2, . . . , xn) which will be combined with all the other model parameters in order to find a solution that fits best with the observed data.

After that, the forward problem is solved (using ray tracing in our case study) and the fitness is determined according to the discrepancy with respect to a desired characteristic. This fitness value (determined by means of an objective function) is then considered in the successive selection and crossover operations: the fittest individuals (i.e. the ones with the highest fitness values) are selected to generate offspring whose characteristics are partly taken from one parent and partly from the other. Elitism is a strategy often implemented to pass the best individual(s) of each generation unchanged to the next generation in order to avoid possible loss of good individuals. Mutation operators allow good genes (that have never appeared before) to be selected and should also ensure that a potentially good component is not lost during the reproduction and crossover operations (Goldberg, 1989; Haupt and Haupt, 1998; Man et al., 1999). The process can stop after a fixed number of generations or when the fitness of an individual reaches a certain previously-fixed value. Although the main principles are the same, the definitions of genetic operators and their relative importance vary from application to application (Haupt and Haupt, 1998).

A typical pseudocode of a simple application of GA in seismic tomography using MATLAB (MDCE) is presented in appendix A.

## 3. Velocity Reconstruction via GAs

The aim of the hybrid application of GA and local optimization methods presented here is to search the unknown velocities defined at the nodes of a 2D model. A priori information about the study area may be used for the definition of the parameter search spaces. Otherwise, a layered model with increasing velocities with depth can be used as a reference model and limits of the parameter search spaces can be defined proportional to the model velocities. Initial population is generated with the random distribution of model parameters over the search space. Models are coded with a binary scheme to process with genetic operators. The same coding was used for all experiments. Once the coding scheme was selected, all the encoded model parameters are merged into a long bit string (chromosome) which will be modified by the algorithm. Higher parameter resolution requires relatively long parameter strings. The algorithm should then determine the fitness of the individual models by decoding the binary information into the velocity model parameters. Selection, crossover and mutation operators act on individuals of proceeding generations. The number of better fitted models is expected to increase in the population throughout the evolution process.

Selection of individuals for mating is based on tournament selection. Tournament selection is the way of selecting the winner of a tournament among a certain number of randomly chosen members of the population based on the fitness values. The winners of two sequential tournaments become the mates selected for crossover. Once the mates are chosen, a scattered crossover takes place between the genes of two individuals. The crossover probability was fixed as 65% for all experiments in this work. A random number chosen uniformly within the [0 1] range designates which mate contributes for that bit of the offspring (if the value of the random number is greater than 0.65 crossover does not occur else second mate gives the gene). Random mutations with a probability of 0.2% are permitted, if the micropopulation option is not turned on. Micropopulation redistributes the models over the search space (keeping the best fitted model) when population becomes nearly homogenous. The crossover and mutation probabilities were fixed to these values according to the experience of authors for the given problem. However, for another problem, or for different settings of population size and number of parameters a different setting for crossover probability might yield the best result.

The solution of the forward modeling involves the accurate calculation of the first arrivals with respect to initial velocity model. Rays for the first arrivals were calculated in two steps. In the first step, graph theory and the revised algorithm of Dijkstra (Moser, 1991) were used for estimating the traveltime of the wave front at each node of the grid and preliminary ray-paths. These paths were further optimized in the second step by a revised bending algorithm, which has been shown to have superior convergence properties (Moser et al., 1992). After the final raypaths were defined by both methods, the width of the approximate Fresnel volume for each point along each source-receiver raypath was estimated, as suggested by Soupios et al. (2001), Mpogiatzis (2006) and Mpogiatzis et al. (2009).

In order to evaluate and test the efficiency of the proposed algorithm and to compare it with the application of local optimization in the same problems, different basic parameters (number of generations and populations, variability of the model and number of nodes of the model), different initial velocity models (layered or homogeneous model) and several source-receivers configurations in different generated synthetic traveltime data sets were examined.

Figure (1) presents the application of GAs in three different crosswell (CW) or combined CW and vertical seismic profile (VSP) experimental geometries. Synthetic traveltime data were generated from a homogeneous velocity structure with four layers plus a high velocity block in the center of the model as shown in figure (1). The first model consists of 3 sources and 7 receivers resulting in 21 rays and low raypaths coverage in the upper and lower part of the resulted model. The second model consists of 2 sources and 8 receivers resulting in 16 rays and better raypath coverage in the upper and lower part of the resulted model. The last experiment consists of 4 sources and 8 receivers resulting in 32 rays and the best raypath coverage in the upper and lower part of the resulted model in comparison to other models. It is quite evident that the best velocity model can be reconstructed (in shape and amplitude) from the (best raypath coverage) after 45 generations.

Since, in all seismic tomographic problems, regardless the experimental geometry, the high velocity anomalies are reconstructed better than the low velocity anomalies, the application of the proposed algorithm in more complex velocity problems is evaluated.

Figure (2) shows the application of the proposed algorithm in a CW model with homogeneous velocity background and two (a positive and a negative) anomalies in the upper and lower part of the model, respectively. The model on the left (fig. 2a) consists of 51 sources (left side, red filled circles) and 51 receivers (right side, green filled circles), resulting in 2601 raypaths (Soupios, 2000). The second model (fig. 2b) consists of 5 sources and 9 receivers, resulting in 45 raypaths. The idea is to examine whether the proposed algorithm can reconstruct the velocity model from much smaller data set. Figures (2c) and (2d) present the final tomographic velocity models of figures (2a) and (2b), respectively. The tomographic velocity model of fig. (2c) was calculated after the inversion of first traveltime arrivals using a least-squares method and applying appropriate damping and smoothing regularization. Both tomographic models were reconstructed by more than 85% in amplitude and shape. The RMS of the first (fig. 2c) and the second (fig. 2d) velocity models are 2.13% and 2.25%, respectively.

We should mention that GAs managed to reconstruct the ideal velocity structure even the code used 58 times less raypaths (45 raypaths instead of the 2601 raypaths) than the model of fig. (2a). Computationally, the model used in case of fig. (2a) (121 nodes) for 2601 raypaths and using local optimization methods, required an average CPU time of 25 min (on an Opteron 8220 2.8Ghz processor) to solve the forward and inverse problem (fig. 2c). On the contrary, the model used in fig. (2b) (24 nodes) for 45 raypaths and with the application of GAs and after 30 generations required on the same CPU an average time of 2150 min (36 hours) to reconstruct the velocity model as shown in fig. (2d). Based on the tests presented in figures (1, 2) it is obvious that GAs are almost 100 times slower than the local optimization methods for the same configuration.

Generally, GAs provide a very fast initial convergence followed by progressively slower improvements (Boschetti, 1995a,b). Thus, the form of the curve suggests that the algorithm should be stopped when an approximate solution has been found, since further improvements may be much more time consuming. Hence, further improvements to the code can be obtained by combining the Genetic algorithm with a local optimizing method, in order to direct the search to the region close to the global solution.

## 3.1 Hybrid Genetic Algorithms - Combination of Global and Local Optimization

In view of their respective characteristics, the combination of global and local optimization is an effective tool for solving the nonlinear inverse problem. Cary and Chapman (1988) accomplished waveform inversion by the combined use of the Monte-Carlo and gradient methods. Stork and Kusuma (1992) combined the genetic algorithm with waveform-related steepest descent for solving the residual static correction forward problem. Chunduru et al. (1997) systematically described seven hybrid optimization methods combining very fast simulated annealing and conjugate gradient (CG) for 2-D imaging of electrical resistivity profile data and seismic interpretation. Among these techniques, the combined use of global and local optimization exploiting the advantages of each method has created a new trend in inversion research.

We present an efficient and robust method for nonlinear travel-time inversion by combining the least-squares solutions and genetic algorithms. Specifically, we constructed a hybrid genetic algorithm with the following assumptions,

If the best fit is less than 0.5 msec, we have used the fittest velocity model as initial (starting) velocity model for the least-squares inversion of the traveltimes to achieve the final model,

If generations are greater than 5 and the misfit change among generations is less than 0.8 msec (almost constant misfit), the velocity models with misfit less than the average misfit are selected and inserted as initial models for solving the forward and inverse problem. The updated velocity models after inversion are transformed to binary format in order to include these "optimum" velocity models into the original population. The selection, crossover and mutation are then applied for the creation of the next generation.

We should mention that the selection of the optimum parameters for the application of hybrid GA, such as the 0.5 msec and 5 generations and 0.8 msec, used in the first and second convergence assumptions respectively, are empirical and extracted after large number of tests.

A dipping layered model with increasing velocity with depth was used in order to check the applicability and efficiency of the proposed hybrid algorithm. A CW and VSP model was applied with 3 sources and 7 receivers (21 raypaths) (fig. 3C - upper). The model consists of 21 nodes (nx=3, nz=7). The starting model for all tests was a homogeneous (1.0 km/s) velocity model. Different population sizes were used to test the proposed GAs (fig. 3B) and hybrid GAs (fig. 3A, D and E). The final velocity models are compared with the ideal velocity model (fig. 3C - upper) while the final tomographic velocity model after the application of the least-squares method (7 iterations and a misfit of 0.29 msec - RMS: 4.16%) is shown. The misfit curve for all the tests illustrates the reliability of the final resulting velocity models.

Models A and D (fig. 3) were tested using the hybrid algorithm by applying a different population size, 15 and 5 times the number of nodes, respectively. The misfit for both models was about 0.7 msec after 30 generations. These results suggest that the estimated velocity model does not depend on the population size and hence, using fewer parameters (or population size) a reliable velocity model can be rapidly obtained.

Model B, presents the reconstructed velocity model applying GAs and after 70 generations. The final misfit is about 0.8 msec. After 5 generations the inversion is turned on and improves the misfit function as is clearly shown in fig. (3A, D and E). After 15 generations, the user can acquire a fairly reliable velocity model. On the contrary, to get a similar (in quality) velocity model from GAs (model B) more than 30 generations need to be applied.

Model E depicts the application of hybrid algorithm in the usual population size (10 times the nodes, velocity parameters) but at the same time, the micropopulation is used. This module works as follows: when the average and the best misfit are very close, the population is perturbed in order to help the algorithm to better "search" into the model space. As shown in figure (3E), the misfit is about 1.32 msec (higher than the other tests) but the final reconstructed velocity model after 30 generations, is closer to the ideal velocity model (fig. 3C-upper) and the interface between the first and the second layer is much better delineated.

The most important result in figure (3) is that the final tomographic velocity model, after inversion (7 iterations) using a least-squares method (fig. 3C - lower model) is far from the real (ideal) model. This is perhaps due to the inefficiency (poor determined character) of the specific experiment (only 21 raypaths to determine 21 nodes-parameters). The tests presented in figure (3) show that: a) the proposed hybrid algorithm (combination of GA with a least-squares method) is working efficiently and, b) is more efficient and capable to reconstruct a typical velocity model than the least-squares method.

On the basis of the results from the preceding tests, the main disadvantage of the GAs approach in seismic tomography applications is the CPU time required for the estimation of the final velocity model. To overcome this problem, the hybrid algorithm was modified for multi-processing systems using the MDCE module of MATLAB. Specifically, the job manager is defined and the idle workers were used for running the hybrid algorithm (see appendix A) in a pseudo-parallel way, distributing the population to the available workers.

Improving the computing time by using the MDCE module of MATLAB and in order to ensure the reliability of the proposed hybrid algorithm an additional test was performed. A typical refraction experiment configuration was examined using 5 sources (at 0, 50, 175, 300 and 320m) along the ground surface and 24 geophones with 10 meters spacing (g1: 60m â€¦ g24: 290m). In total, 120 raypaths were used to define 45 unknown model parameters (9 by 5 model velocity nodes). A synthetic data set simulating the presence of a concealed body (low velocity anomaly) (figure 4) was adopted.

Figure (4) shows the ideal velocity and geological structure at the top of the figure. The ray coverage is also presented depicting the areas of high resolution. The velocity models applying GAs after several generations (gen: 1, 10, 20, 28) are presented. The final tomographic velocity models after the application of a least-squares method are also shown. The best velocity model acquired after 28 generations, with a misfit of the order of 2 msec (see misfit curve at the upper part of figure 4). The best tomographic velocity model after 13th iterations of the inversion algorithm is also presented, with a misfit of about 0.75msec (RMS=4.08%).

It is clear that the proposed hybrid algorithm produces much better results (the velocity model was well reconstructed in both amplitude and shape) that the least-squares methods. This further supports the idea that the proposed hybrid genetic/least-squares algorithm can really improve the reconstruction of the subsurface applying seismic tomography.

## 4. Conclusions

In this work, a hybrid genetic/least-squares algorithm (combined global and local optimization method) is proposed and implemented in multicore engines, in order to solve large dimensional problems and fast location of global minima, providing robust and reliable tomographic images. It should be also mentioned that, the hybrid GA can better recognize the low velocity areas than traditional least-squares method approaches. The final velocity model obtained by this algorithm is independent of the initial model, while the algorithm requires relatively fewer data in order to achieve convergence. This advantage can be quite important in real world study cases: application to a typical crosshole project for a bridge foundation using more than 12000 traveltimes resulted in similar velocity results by the application of the proposed hybrid algorithm using only 100 traveltimes.

## Acknowledgments

The authors would like to thanks the Seismological Station of the Aristotle University of Thessaloniki, for providing computing time in distributed machines. PS would also like to thank the Greek Ministry of Education for supporting a 6 months visit to Ankara University, as well as Prof. F. Boschetti for providing valuable information, codes and help for the completion of this project.