Genetic Algorithms (GAs) have been used successfully to solve many complex problems. However, as the search space grows, GAs tend to become slow and resource-demanding. Fortunately, the implicitly parallel nature of GAs' operations has been exploited to speed up processing by distributing sub-tasks over many processors simultaneously. This paper reviews and summarizes the characteristics, differences, and applications of the different Parallel GA (PGA) approaches.
Keywords: Parallel Genetic Algorithms, Master-Slave Model, Island Model, Coarse-Grained Model, Fine-Grained Model, Parallel Computing.
Genetic Algorithms (GAs) are stochastic optimization search algorithms that have been used successfully to solve NP-Hard problems. Despite this success, as problems grow more and more complex, the optimum solution takes longer and longer to find, which makes GAs impractical in some of the more complex situations [1, 2].
To overcome this weakness, and because GAs' operations are parallel in nature [1, 3-5], researchers moved toward what are known as Parallel Genetic Algorithms (PGAs), which have shown great success in boosting performance. PGAs have been implemented using many different approaches, which were afterwards studied, classified, and formulated to carry PGA theory toward maturity. The three main classes of PGAs are the coarse-grained, fine-grained, and master-slave models [4-6]. In addition, the hierarchical model, regarded by many researchers as a fourth class, is a hybrid approach in which different models are implemented at different levels of a hierarchical architecture [4, 6].
This paper aims to review the literature on PGAs, summarize the main issues arising in the field, and give some examples of the main practical applications. Among the previous literature reviews in this field is a well-detailed survey written in 1997 by E. Cantú-Paz. Although it is old compared with the age of this field of knowledge (two to three decades), it offers a solid base on the different PGA models from which even the most recent research rarely deviates.
This paper is organized as follows: Section 2 gives a brief introduction to the operations of the Genetic Algorithms (GAs). Section 3 discusses the different models of the Parallel Genetic Algorithms (PGAs), their characteristics, variants, and how they differ. Section 4 presents some of the most important applications of the PGAs while section 5 is a conclusion.
2. The Genetic Algorithms (GAs)
The rise of Genetic Algorithms' (GAs') theory goes back to the mid-1960s and the work of John Holland and his colleagues and students at the University of Michigan [1, 2]. Some researchers consider Holland's book (1975) to be the first to present GA theory. For interested readers, Goldberg's book is a very good starting reference, as suggested elsewhere and as can be suggested here; it covers GAs from shallow definitions for beginners to deep details and variants.
GAs are stochastic search algorithms that mimic natural selection and natural genetics. A GA defines a population of individuals (chromosomes), where each chromosome represents a possible solution to the problem under consideration. The initial population, called the first generation, can be generated randomly, in which case its quality (fitness) is most likely poor, or heuristically, which produces better initial individuals. The choice depends mainly on the nature of the initially available information [1, 2].
Genetic Algorithms define two main operators that are applied iteratively to the individuals of the population: crossover and mutation. Crossover generates two new chromosomes (children) by merging the information of two chosen chromosomes (not chosen fully at random; see below), while mutation modifies one of the pieces of information (genes) encoded in a chromosome. Applying crossover and mutation to a population generates a new population that replaces the old one; we then say that we have a new generation, hopefully better than the previous one. Repeating this process yields generations with better and better chromosomes until a stopping criterion is reached, such as observing no improvement for a number of generations or reaching the maximum allowed number of generations [1, 2].
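The generational loop described above can be sketched as follows. This is a minimal illustration, not taken from any cited work: the "OneMax" fitness function (count the 1-bits), the parameter values, and the fitness-proportional parent choice are all assumptions made for the sake of a small runnable example.

```python
import random

CHROM_LEN, POP_SIZE, GENERATIONS = 20, 30, 50
P_CROSS, P_MUT = 0.9, 0.01

def fitness(chrom):
    return sum(chrom)  # toy "OneMax" fitness: count the 1-bits

def crossover(a, b):
    point = random.randint(1, CHROM_LEN - 1)  # single crossover point
    return a[:point] + b[point:], b[:point] + a[point:]

def mutate(chrom):
    return [g ^ 1 if random.random() < P_MUT else g for g in chrom]

def evolve():
    # First generation: random chromosomes, fitness most likely poor.
    pop = [[random.randint(0, 1) for _ in range(CHROM_LEN)]
           for _ in range(POP_SIZE)]
    for _ in range(GENERATIONS):
        fits = [fitness(c) for c in pop]
        new_pop = []
        while len(new_pop) < POP_SIZE:
            # Parents chosen with probability proportional to fitness (roulette).
            a, b = random.choices(pop, weights=fits, k=2)
            if random.random() < P_CROSS:
                a, b = crossover(a, b)
            new_pop += [mutate(a), mutate(b)]
        pop = new_pop[:POP_SIZE]  # the new generation replaces the old one
    return max(pop, key=fitness)

random.seed(42)
best = evolve()
print(fitness(best))
```

Here the stopping criterion is simply a fixed number of generations; the no-improvement criterion mentioned above would track the best fitness across generations instead.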
Chromosomes are not chosen for crossover fully at random; they are usually selected using the roulette wheel technique, which chooses chromosomes randomly with a probability proportional to each chromosome's fitness value. This gives better chromosomes a higher probability of being involved in reproduction (crossover) operations [1-3, 8].
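A direct implementation of the roulette wheel can look like this (a sketch; the function name is illustrative): build the cumulative fitness totals, "spin" by drawing a uniform random number over the total, and return the chromosome whose slice of the wheel contains it.

```python
import random
from itertools import accumulate
from bisect import bisect_right

def roulette_select(population, fitnesses):
    """Pick one chromosome with probability proportional to its fitness."""
    wheel = list(accumulate(fitnesses))   # cumulative fitness "slices"
    spin = random.uniform(0, wheel[-1])   # where the ball lands
    # bisect_right finds the slice containing the spin; min() guards the
    # rare case where uniform() returns exactly the total.
    return population[min(bisect_right(wheel, spin), len(population) - 1)]

# Sanity check: with fitnesses 1, 2, and 7, chromosome "c" should win
# roughly 70% of the spins.
random.seed(1)
counts = {"a": 0, "b": 0, "c": 0}
for _ in range(10_000):
    counts[roulette_select(["a", "b", "c"], [1, 2, 7])] += 1
print(counts)
```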
Many variants of GAs modify the way crossover and mutation are implemented. Some define a single crossover point [9, 10], while others define two or more [5, 9]. Some implement mutation by flipping binary bit values [5, 10, 11], while others prefer swapping two gene values (genes are the small building blocks of chromosomes, and each gene represents a single piece of information). Many other less common flavours of mutation and crossover can be found in [3, 8].
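The operator variants above can be sketched as follows (the helper names are illustrative, not taken from the cited works):

```python
import random

def one_point_crossover(a, b):
    p = random.randint(1, len(a) - 1)                  # single crossover point
    return a[:p] + b[p:], b[:p] + a[p:]

def two_point_crossover(a, b):
    p, q = sorted(random.sample(range(1, len(a)), 2))  # two crossover points
    return a[:p] + b[p:q] + a[q:], b[:p] + a[p:q] + b[q:]

def bit_flip_mutation(chrom, p_mut=0.01):
    # Flip each binary gene independently with probability p_mut.
    return [g ^ 1 if random.random() < p_mut else g for g in chrom]

def swap_mutation(chrom):
    # Exchange two gene values; common for permutation-encoded chromosomes.
    i, j = random.sample(range(len(chrom)), 2)
    chrom = chrom[:]
    chrom[i], chrom[j] = chrom[j], chrom[i]
    return chrom
```

Note that both crossovers conserve genetic material between the parent pair and the children, and swap mutation preserves the multiset of gene values, which is why it suits permutation encodings.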
Many factors must be tuned properly when implementing GAs to maximize their benefit, and these factors are highly application-dependent. They include the crossover and mutation probabilities, the chromosome representation, elitism and selection techniques, and the crossover and mutation techniques themselves. To date, trial and error remains the main methodology for picking the best values for these variables. For more details, [1, 11] are good references to consider.
GAs have earned a great reputation for robustness, finding solutions despite real-world difficulties such as noise and incomplete information [1, 5]. They can solve NP-Hard problems and can find the global optimum while avoiding being trapped in local optima [12, 13].
One major drawback of GAs is their long execution time and heavy memory requirements, mainly as problems grow larger, with larger chromosomes and more generations required for convergence [5, 12, 14, 15]. However, the operations of GAs are parallel in nature, which makes producing parallel flavours of them a logical approach [1, 3-5, 11]. The next section discusses these parallel flavours in more detail.
3. Parallel Genetic Algorithms (PGAs)
Genetic Algorithms are great candidates for parallelization due to their parallel nature [1, 3-5]. Parallelization is done by distributing the work among a number of processors (nodes) connected using a defined topology, whether a processor cluster or a network of workstations [3-5, 12]. The goal is to gain speed-up, allowing larger and more complex problems to be solved efficiently in shorter times.
The first discussion of parallelizing GAs may have been by Bethke in 1976. Even so, Parallel Genetic Algorithms (PGAs) did not begin to flourish until the late 1980s, in the works of Tanese, Grefenstette, and others [1, 6, 11]. For interested readers, two works of Erik Cantú-Paz from 1997 and 1995 respectively [6, 11] are suggested. Although these works are more than a decade old, they summarize the PGA literature in a clear way that is still followed by very recent works (as can be seen when the applications are discussed later in this paper).
Parallelizing GAs has taken many different approaches. The most sensible classification divides PGAs into four main classes (models): the Coarse-Grained Model (also known as the Island Model, Distributed GAs "DGAs", and the Multi-Deme Model), the Fine-Grained Model, the Master-Slave Model, and the Hierarchical Model [4, 6, 9, 11]. Other less common classification approaches can be found in [1, 5, 15].
Sections 3.1 - 3.4 discuss these four classes in turn in more detail.
3.1. Coarse-Grained Model (The Island Model)
The most common model for parallelizing genetic algorithms is the coarse-grained model [6, 11, 12], also known as the island model [9, 16] and the multi-deme model. This model divides the single population into many smaller sub-populations distributed among the nodes; each sub-population evolves independently, and some information is exchanged between the nodes periodically [4, 6, 16].
This model defines a number of new terms that do not appear in the context of simple GAs. An island is a sub-population that evolves independently, and migration is the action of transferring individuals between islands. The main parameters to consider when designing a coarse-grained PGA are the number of islands, the migration rate (how many individuals should migrate?), the migration frequency (what is the period between two consecutive migrations?), and the migration topology (which islands are connected together, and how?) [6, 9, 11, 14, 16].
Although much study has been devoted to understanding migration factors and their effects, some researchers state that migration is still not fully understood [6, 9]. It has been noticed that smaller sub-populations converge faster than a large single panmictic population, though the final solution is usually poorer. This is where the effect of migration appears: exchanging individuals every number of generations prevents each sub-population from converging to a local optimum too fast and keeps variation in the search space longer. This does not mean that the migration frequency and rate should be increased too much, as that leads to two main drawbacks: higher communication overhead and less local evolution on each island. Hence the importance of determining the best possible migration rate and frequency for the problem at hand [6, 11, 13].
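These parameters can be made concrete with a minimal island-model sketch (a sequential simulation of the parallel scheme; the parameter names, the OneMax toy fitness, and the ring topology are illustrative assumptions). Each island evolves its own sub-population, and every MIGRATION_FREQ generations its best MIGRATION_RATE individuals migrate to the next island on a ring, replacing the recipient's worst individuals.

```python
import random

N_ISLANDS, ISLAND_POP, CHROM_LEN = 4, 10, 16
GENERATIONS, MIGRATION_FREQ, MIGRATION_RATE = 40, 5, 2

def fitness(c):
    return sum(c)  # toy "OneMax" fitness, for illustration

def step(pop):
    # One generation inside an island: roulette parents, crossover, mutation.
    fits = [fitness(c) for c in pop]
    new = []
    while len(new) < len(pop):
        a, b = random.choices(pop, weights=fits, k=2)
        p = random.randint(1, CHROM_LEN - 1)
        child = a[:p] + b[p:]
        new.append([g ^ 1 if random.random() < 0.02 else g for g in child])
    return new

random.seed(7)
islands = [[[random.randint(0, 1) for _ in range(CHROM_LEN)]
            for _ in range(ISLAND_POP)] for _ in range(N_ISLANDS)]

for gen in range(1, GENERATIONS + 1):
    islands = [step(pop) for pop in islands]  # independent evolution
    if gen % MIGRATION_FREQ == 0:
        # Ring topology: island i sends its best individuals to island i+1.
        emigrants = [sorted(pop, key=fitness)[-MIGRATION_RATE:]
                     for pop in islands]
        for i, pop in enumerate(islands):
            pop.sort(key=fitness)                       # worst first
            pop[:MIGRATION_RATE] = emigrants[(i - 1) % N_ISLANDS]

best = max((c for pop in islands for c in pop), key=fitness)
print(fitness(best))
```

Raising MIGRATION_FREQ or MIGRATION_RATE in this sketch corresponds directly to the trade-off described above: more mixing, but more communication and less independent local evolution.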
Many variants of this model have been implemented using different migration approaches. Migration can be synchronous, where individuals are exchanged at predefined instants of time (the most common flavour), or asynchronous, where data is exchanged depending on certain events or conditions [6, 16]. On one hand, synchronization helps in monitoring the algorithm: it makes it easier to determine the best current individual in the entire population or the current state of each island, which is not the case in asynchronous implementations. On the other hand, synchronization hurts overall utilization, since some islands remain idle waiting for slower islands to catch up, and it adds the extra overhead of centralized synchronization. As synchronization conflicts with scalability, asynchronous migration is more likely to be used in larger distributed environments.
Figure 1: The ring topology for the island model.
Migration topology has traditionally been ignored when DGAs are discussed. One reason is that in most cases it is limited by the topology of the available processing clusters; another is the claim that it has a negligible effect on overall performance [6, 16]. In any case, the most common topologies in this context are the hypercube, the torus, and the ring [5, 6, 16] (see figure 1). Some recent research has revisited this issue and undermined the older claims, showing that the migration topology has a large impact on the quality of the solution, and that this impact grows with the number of islands.
One thing that should be pointed out here is that this model modifies the behavior of simple genetic algorithms, since evolution happens within smaller populations rather than having the entire population evolving in one pool [6, 9]. Some claim that although this model converges faster, it produces poorer results than simple GAs [6, 11]; others counter that it is able to produce similar, or even better, results.
The island model was originally introduced to parallelize GAs, but it has been applied successfully to many other population-based algorithms, such as simulated annealing, tabu search, GRASP, differential evolution, ant colony algorithms, and genetic programming. This success opens the door to heterogeneous approaches in which different population-based algorithms run on different islands to tackle the same problem [12, 16, 17].
3.2. Fine-Grained Model
Figure 2: Diffusion model (fine-grained model) with a 2D grid topology.
The fine-grained model (also known as "the diffusion model") defines only one population, but the physical structure (topology) limits the interactions between individuals. Individuals interact only within close neighborhoods, but because the neighborhoods overlap, good individuals are able to disseminate across the entire population [6, 9].
Studies of the fine-grained model have shown that one of the important parameters in this context is the ratio between the radius of the neighborhood and the radius of the whole grid. As the relative size of the neighborhood increases, performance degrades; if the neighborhood is large enough, the model behaves like a single panmictic population.
Considering the implementation topology, most fine-grained PGA implementations place individuals on a 2-D grid, because this is the topology of many massively parallel computers [6, 9] (see figure 2). Other implementations use a torus, or a population structure that resembles a ladder with the upper and lower ends tied together.
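A minimal diffusion-model sketch on a 2-D grid might look like this (illustrative assumptions throughout: one individual per cell of a small torus, four-neighbor mating, the OneMax toy fitness, and replace-if-not-worse survival). Good genes spread, or "diffuse", only through overlapping neighborhoods.

```python
import random

GRID, CHROM_LEN, GENERATIONS = 8, 12, 30

def fitness(c):
    return sum(c)  # toy "OneMax" fitness, for illustration

random.seed(3)
grid = [[[random.randint(0, 1) for _ in range(CHROM_LEN)]
         for _ in range(GRID)] for _ in range(GRID)]

def neighbours(x, y):
    # Four von Neumann neighbours on a torus (edges wrap around).
    return [grid[(x - 1) % GRID][y], grid[(x + 1) % GRID][y],
            grid[x][(y - 1) % GRID], grid[x][(y + 1) % GRID]]

for _ in range(GENERATIONS):
    new = [[None] * GRID for _ in range(GRID)]
    for x in range(GRID):
        for y in range(GRID):
            mate = max(neighbours(x, y), key=fitness)   # best local neighbour
            p = random.randint(1, CHROM_LEN - 1)        # one-point crossover
            child = grid[x][y][:p] + mate[p:]
            child = [g ^ 1 if random.random() < 0.02 else g for g in child]
            # The child replaces the resident only if it is at least as fit.
            new[x][y] = child if fitness(child) >= fitness(grid[x][y]) \
                        else grid[x][y]
    grid = new

best = max((c for row in grid for c in row), key=fitness)
print(fitness(best))
```

Enlarging the neighbourhood (e.g. including diagonal or more distant cells) moves this sketch toward the panmictic behavior described above.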
The small distributed memories and low communication cost of this model make it very suitable for VLSI implementation. For example, a massively parallel hardware architecture has been suggested to implement a diffusion model for PGAs.
3.3. Master-Slave Model
Figure 3: The master-slave model.
The master-slave model (also called "global parallelization") defines one single panmictic population owned by a master node, which is responsible for applying the genetic operators (crossover and mutation); chromosome evaluation (fitness calculation) is the only distributed task. The master distributes the chromosomes (individuals) among the slaves, which evaluate them and return their fitness values. When the master has received all the evaluation results from the slaves, it uses them to assess the population and apply the next generation's genetic operations accordingly [5, 6, 11] (see figure 3).
Some implementations of the master-slave model also distribute the crossover and mutation operations in addition to the fitness calculation. Many other researchers state that these two operations are so simple that the communication overhead becomes comparable to the speed-up gain, especially since fitness calculation is usually the most costly function in GAs [5, 6].
Two flavours of the master-slave model have been introduced: a synchronous flavour, in which the master node waits for all slaves to finish calculating the fitness values of all chromosomes before proceeding [1, 6, 9, 11], and an asynchronous flavour, in which the master proceeds with whatever information is available [6, 9]. Most implementations use the synchronous one, because the asynchronous implementation alters the nature of the simple sequential genetic algorithm, making it harder to understand and monitor the algorithm's behavior [1, 6].
Another aspect is that the chromosomes can be stored either in one shared memory accessed by all nodes (master and slaves) when needed, or in a master-owned memory, in which case distributing the chromosomes among the slaves is the master's responsibility [1, 6, 9].
Another factor to consider when implementing PGAs using the master-slave model is how the chromosomes are distributed among the slaves. They can be distributed with a constant number of chromosomes per slave, which is the simplest way, or the load can be balanced over the slaves according to the processing capabilities of each, to achieve the best possible resource utilization [1, 6, 9].
A notable point about the master-slave model is that it does not alter the behavior of simple sequential GAs (the less common asynchronous flavour being the exception). This makes it possible to apply all the theory and studies around simple GAs to this model, benefitting from the rich literature in this field [5, 6].
3.4. Hierarchical Model
The hierarchical model (also called the "hybrid model") for PGAs defines two levels of genetic algorithm populations, with a different model at each level. Most hierarchical implementations use the multi-deme (island) model at the upper level and vary among the coarse-grained, fine-grained, and master-slave models at the lower level [4, 6, 9, 11].
4. Applications of PGAs
Parallel flavours of genetic algorithms have been applied in many applications, especially those in which traditional genetic algorithms have already shown success. The literature is full of application examples; those in this section are only a sample.
One common class of applications for PGAs is scheduling problems. The common job shop scheduling problem has been tackled using the master-slave model. It is one of the most complicated scheduling problems, with no efficient algorithm that can guarantee an optimal solution; the real difficulty in finding good solutions is the size of the search space. Consider the 6-job, 6-machine problem: the number of possible solutions is (6!)^6, or over 1.3 × 10^17. This master-slave implementation is called a "client-server architecture", and it distributes both crossover and mutation in addition to the evaluation function. For different problem sizes, the results were best when using 4 clients and a server. The same problem has also been tackled using both the coarse-grained and fine-grained models.
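The size of that search space can be checked directly:

```python
import math

# (6!)^6 possible solutions for the 6-job, 6-machine problem.
solutions = math.factorial(6) ** 6
print(solutions)   # 139314069504000000, i.e. about 1.39 × 10^17
```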
Other scheduling problems include dynamic voltage scaling (DVS) scheduling for distributed embedded systems, which has been solved by coarse-grained multi-deme genetic algorithms. The goal of this problem is to dynamically schedule the voltage scales so as to minimize energy consumption in embedded systems; it is an NP-Hard problem that PGAs solved successfully, with a further time improvement over the older sequential GA approaches.
Coarse-grained PGAs have been successfully applied to many other problems, such as reactive power optimization dispatch, managing the energy resources of environment-monitoring sensor networks, distributing computing loads to the processing nodes of MIMD computers, the graph partitioning problem, classification rule extraction in data mining (EDGAR), DNA fragment assembly, and multiobjective VLSI cell placement. Most of these problems are stated as NP-Hard.
Fine-grained PGAs have been successfully applied to graph partitioning, prediction of the secondary structure of RNA, and the handwritten character recognition (HCR) problem.
In addition to the job shop scheduling problem mentioned above, the master-slave model has been applied successfully to the NP-Hard problem of dynamically scheduling heterogeneous tasks onto heterogeneous processors in a distributed environment.
One more interesting application is solved by what are called "dynamic distributed double guided genetic algorithms" (D3G2A), which are based on coarse-grained PGAs. The problem is the well-known constraint satisfaction problem (CSP), a class of problems as which many real-world problems can be formulated; examples include resource allocation, frequency allocation, and vehicle routing. The problem was solved using D3G2A in 2006, and again by the same authors using distributed particle swarm optimization in 2010.
It is not possible to record every application in the literature in this quick review. However, the applications reported here are sufficient to make the reader appreciate genetic algorithms as a strong candidate for solving NP-Hard problems without being trapped in local optima, and to appreciate the performance boost that results from distributing GAs' operations.
5. Conclusions and Future Work
The great success of genetic algorithms (GAs) as a stochastic, evolutionary, population-based search technique, especially in solving NP-Hard problems while avoiding being trapped in local optima, has led to a wide range of research and applications. However, GAs suffer from long execution times and large memory requirements as problem sizes grow.
Fortunately, GAs' operations are implicitly parallel in nature, which has led to different parallel implementations that aim to boost overall performance. The most sensible classification divides PGAs into: (1) the coarse-grained (island) model, where the population is divided into sub-populations distributed among nodes (islands); each island evolves independently while exchanging some individuals every number of generations; (2) the fine-grained model, where one single population is defined, but the physical limits of the topology used prevent individuals from communicating except with their neighbors, though the overlap between neighborhoods allows good individuals to disseminate through the whole network; (3) the master-slave model, where one population exists on the master node; the master performs crossover and mutation and distributes the fitness calculation among the slaves; and (4) the hierarchical model, with two levels of genetic algorithms, each level implemented using one of the three models above.
Many variants of these basic classes have been applied to the different applications shown in the literature. The main problem facing researchers is determining the best variant of the best model for the application under consideration, as well as determining the best values of the many factors that affect the algorithm's performance in terms of speed and quality. The current approach relies on trial and error over a large number of experiments; future work should concentrate on a stronger theoretical base that enables better theoretical approximations.