Genetic Algorithm For Speech Signals Separation Computer Science Essay



Blind source separation (BSS) is an important problem in speech and image signal processing. It aims to recover a set of source signals from their mixtures when neither the sources nor the mixing process is known. In this paper, we propose a blind source separation method based on the continuous genetic algorithm (CGA) and the binary genetic algorithm (BGA). The proposed method includes several preprocessing steps, namely centering, whitening and symmetric orthogonalization, each of which plays an important role in solving the problem. We give a detailed analysis of the separation of speech mixtures. The method uses kurtosis, a high order statistic, as a simple main criterion for signal separation. Most papers have focused on at most three sources, whereas we investigate blind source separation for more than three sources. The speech signals used to evaluate the method are selected from the TIMIT database and are mixed with each other as well as with Gaussian and white noise. The efficiency of the proposed method is demonstrated by simulations. The simulation results show high accuracy, fast convergence and suitable SNR for the continuous genetic algorithm, and show that CGA gives considerably better results than BGA on blind source separation problems.

Index Terms- Blind source separation, Centering, Whitening, Symmetric orthogonalization, High order statistics, Genetic algorithm


In recent years, blind signal separation (BSS) has attracted the attention of experts because of its high performance in different fields of signal processing. At present, it is used as a main processing method in many applications such as image processing and restoration, biomedical signal processing, speech signal recognition, medical data processing, radar signal communication and sonar [1, 2, 3, 4]. The most classical application of BSS in speech signal processing is the cocktail party effect, which occurs in an acoustic environment in the presence of background noise or competing speakers. In BSS, the goal is to retrieve independent and inaccessible source signals from their mixtures without knowing the mixing system. The term "blind" refers to the fact that the source signals cannot be observed and the mixing model is unknown [5]. Many approaches have been considered for solving the BSS problem and various algorithms have been proposed, e.g. Independent Component Analysis (ICA), Principal Component Analysis (PCA) and others [6, 7, 8, 9]. The most important and simplest of them is ICA, whose purpose is to find signal components that are as statistically independent as possible. This algorithm is based on the random and natural gradient [10]. Other algorithms such as FastICA, Maximum Kurtosis, Infomax and Maximum Likelihood have also been used to solve BSS [11], [12]. Many neural networks have been proposed whose operation depends on an update formula and an activation function, both of which are updated to maximize the independence between the estimated signals [13, 14]. However, these algorithms depend on the distribution of the source signals. Since the separation is carried out blindly and no information about the source signals is available, the distribution function of the source signals has to be estimated, which reduces the accuracy of the solution.

The problem of source separation is to extract independent signals from their linear or nonlinear mixtures. The models of the BSS problem have been widely studied [15, 16, 17]. One of them is the linear model, which is the one principally considered in various fields [18], [19]. BSS is also a popular search problem among researchers because it can be solved with evolutionary algorithms such as the continuous and binary genetic algorithms, particle swarm optimization (PSO) and so on [20, 21]. The genetic algorithm (GA) is a successful evolutionary algorithm, and all of its forms provide heuristic solutions to combinatorial optimization problems. Such problems have become a notable subject in many scientific applications.

In this paper, we mainly study a blind source separation approach for linearly mixed signals that obtains the coefficients of the separating matrix by using both forms of the genetic algorithm. The genetic algorithm is introduced in [22, 23]. Its operation depends principally on the fitness function, which here uses kurtosis, a high order statistic (HOS) [24, 25, 26]. Kurtosis is a simple and necessary criterion for estimating the dependency among signals [27]. With kurtosis as the fitness function, the genetic algorithm does not need activation functions of the kind required in neural networks [28]. Our simulation studies demonstrate that the GA-based BSS scheme is robust and reaches globally optimal solutions from any initial values of the separation system. The simulation analysis shows that the result of BSS based on the continuous genetic algorithm is clearly better than that of the binary genetic algorithm. The proposed method is therefore particularly effective for this kind of optimization problem.


Assume that there exist $n$ unknown speech signals $s_1, \dots, s_n$ that are as mutually independent as possible. In the linear model of BSS, the source speech signals are assumed to be mixed linearly by an unknown matrix $A$:

$$\mathbf{x} = A\mathbf{s} \qquad (1)$$

where $\mathbf{s}$ and $\mathbf{x}$ are the $n$-dimensional source and mixed speech signals, and $n$ is the number of sources. The goal in solving the BSS problem is to recover the source signals from $\mathbf{x}$ without knowing the mixing matrix $A$. To do this, a separating matrix $W$ should be found, which in the ideal situation is $W = A^{-1}$. In practice, we look for a $W$ such that:

$$\mathbf{y} = W\mathbf{x} \qquad (2)$$

so that $\mathbf{y}$ contains the $n$-dimensional estimates of the source signals, and the BSS problem is thereby solved. A general model of the BSS problem with sparse representation, illustrated in Fig. 1, includes three procedures: an unknown mixing model, recognition of the mixing matrix and a source signal retrieval process.
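As a minimal sketch of the linear model, the following NumPy snippet mixes synthetic sources with a known matrix and recovers them with the ideal separating matrix $W = A^{-1}$. The Laplace-distributed stand-in sources and the numeric entries of `A` are assumptions made for illustration only; in a real BSS problem `A` is not available and `W` has to be estimated.

```python
import numpy as np

rng = np.random.default_rng(0)

n, T = 3, 1000                           # number of sources, samples per source
s = rng.laplace(size=(n, T))             # stand-in super-Gaussian sources, not real speech
A = np.array([[1.0, 0.5, 0.3],           # illustrative mixing matrix (hypothetical values)
              [0.4, 1.0, 0.6],
              [0.2, 0.7, 1.0]])

x = A @ s                                # mixtures: x = A s
W_ideal = np.linalg.inv(A)               # ideal separating matrix, normally unavailable
y = W_ideal @ x                          # estimates: y = W x

max_error = np.max(np.abs(y - s))
print(f"largest reconstruction error: {max_error:.2e}")
```

The reconstruction error is at the level of floating-point rounding only because the true `A` was used; the rest of the method exists to find a usable `W` without it.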


A. Centering

Fig. 1 The general BSS flowchart consisting of unknown mixing model, recognition of mixing model and source retrieval operation

One of the most basic and necessary parts of preprocessing is to center the mixed signal $\mathbf{x}$, i.e. to subtract its mean vector $\mathbf{m} = E\{\mathbf{x}\}$ so that it becomes a zero-mean signal [26]. This step is needed because the kurtosis of a signal $y$ with mean $m_y = E\{y\}$ is in general obtained as follows:

$$\mathrm{kurt}(y) = E\{(y - m_y)^4\} - 3\left(E\{(y - m_y)^2\}\right)^2 \qquad (3)$$

Centering the data makes the calculation of kurtosis easier, so that we can compute it from a simpler formula:

$$\mathrm{kurt}(y) = E\{y^4\} - 3\left(E\{y^2\}\right)^2 \qquad (4)$$

After estimating the mixing matrix $A$ with centered data, we can complete the estimation by adding the mean vector of $\mathbf{s}$ back to the centered estimates of $\mathbf{s}$. The mean vector of $\mathbf{s}$ is given by $A^{-1}\mathbf{m}$, where $\mathbf{m}$ is the mean that was subtracted in the preprocessing.
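The centering step and the zero-mean kurtosis formula $E\{y^4\} - 3(E\{y^2\})^2$ can be sketched as follows. The function names `center` and `kurtosis` and the synthetic Laplace test data are illustrative assumptions, not part of the original method description.

```python
import numpy as np

def center(x):
    """Subtract the per-channel mean; return the centered data and the mean vector m."""
    m = x.mean(axis=1, keepdims=True)
    return x - m, m

def kurtosis(y):
    """Kurtosis of zero-mean rows: E{y^4} - 3 (E{y^2})^2, with sample averages."""
    return np.mean(y**4, axis=1) - 3.0 * np.mean(y**2, axis=1) ** 2

rng = np.random.default_rng(1)
x = rng.laplace(loc=2.0, size=(2, 50_000))   # two channels with a non-zero mean
xc, m = center(x)

print("channel means after centering:", xc.mean(axis=1))
print("kurtosis of centered channels:", kurtosis(xc))
```

Laplace-distributed data is super Gaussian, so both kurtosis values come out positive; a Gaussian channel would give a value close to zero.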

B. Whitening

Another useful preprocessing strategy in ICA is to whiten the observed signals [29]. This means that before applying the ICA algorithm (and after centering), we transform the observed signal $\mathbf{x}$ linearly so that we obtain a new signal $\tilde{\mathbf{x}}$ which is white, i.e. its components are uncorrelated and their variances equal unity. In other words, the covariance matrix of $\tilde{\mathbf{x}}$ equals the identity matrix:

$$E\{\tilde{\mathbf{x}}\tilde{\mathbf{x}}^T\} = I \qquad (5)$$

The whitening transformation is always feasible. One popular method for whitening is to use the eigenvalue decomposition (EVD) of the covariance matrix $E\{\mathbf{x}\mathbf{x}^T\} = E D E^T$, where $E$ is the orthogonal matrix of eigenvectors of $E\{\mathbf{x}\mathbf{x}^T\}$ and $D$ is the diagonal matrix of its eigenvalues, $D = \mathrm{diag}(d_1, \dots, d_n)$. Whitening can now be calculated by:

$$\tilde{\mathbf{x}} = E D^{-1/2} E^T \mathbf{x} \qquad (6)$$

The benefit of whitening is that the new mixing matrix $\tilde{A} = E D^{-1/2} E^T A$ is orthogonal. Assuming independent sources of unit variance, $E\{\mathbf{s}\mathbf{s}^T\} = I$, this can be seen from:

$$E\{\tilde{\mathbf{x}}\tilde{\mathbf{x}}^T\} = \tilde{A}\, E\{\mathbf{s}\mathbf{s}^T\}\, \tilde{A}^T = \tilde{A}\tilde{A}^T = I \qquad (7)$$

This property reduces the number of parameters that need to be estimated. Instead of having to estimate the $n^2$ parameters which are the elements of the separating matrix, we only need to estimate the new orthogonal matrix. An orthogonal matrix contains n(n−1)/2 degrees of freedom in this problem.
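A minimal sketch of EVD-based whitening, assuming the data is stored as a channels-by-samples array and that the covariance is estimated from the samples. The helper name `whiten` and the mixing matrix values are hypothetical.

```python
import numpy as np

def whiten(x):
    """Whiten centered data x (channels x samples) via EVD of its sample covariance."""
    cov = x @ x.T / x.shape[1]                 # sample estimate of E{x x^T}
    d, E = np.linalg.eigh(cov)                 # cov = E diag(d) E^T
    V = E @ np.diag(1.0 / np.sqrt(d)) @ E.T    # whitening matrix E D^{-1/2} E^T
    return V @ x, V

rng = np.random.default_rng(2)
s = rng.laplace(size=(3, 20_000))              # stand-in sources, not real speech
A = np.array([[1.0, 0.5, 0.3],                 # illustrative mixing matrix
              [0.4, 1.0, 0.6],
              [0.2, 0.7, 1.0]])
x = A @ s
x = x - x.mean(axis=1, keepdims=True)          # centering comes before whitening

z, V = whiten(x)
cov_z = z @ z.T / z.shape[1]
print(np.round(cov_z, 6))                      # close to the identity matrix
```

`numpy.linalg.eigh` is used because the covariance matrix is symmetric; it returns real eigenvalues and orthonormal eigenvectors.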


Algorithms based on evolutionary mechanisms can be an effective way to solve the BSS problem, because they search for the optimum coefficients of the separating matrix that minimize the dependency among the estimated signals. Using a suitable fitness function, the initial population can be converted into a new population in which the independence among the components is maximized. Since GA is a successful and fast evolutionary algorithm, we take advantage of it in this paper.

A. Fitness function for GA

A wide variety of criteria, such as the correlation function, negentropy, entropy and kurtosis, exist for measuring the independence among signals. Among them, kurtosis is a very simple and essential measure. The kurtosis of a zero-mean signal $y$ can be defined as:

$$\mathrm{kurt}(y) = E\{y^4\} - 3\left(E\{y^2\}\right)^2 \qquad (8)$$

According to the Central Limit Theorem, which is central to ICA, the distribution of a sum of independent random variables tends toward a Gaussian distribution under certain conditions [27]. Thus, a sum of two independent random variables usually has a distribution that is closer to Gaussian than either of the two original random variables.
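The effect can be checked numerically with a small sketch. Uniformly distributed variables are used here only as a convenient non-Gaussian example (an assumption for illustration); the kurtosis is computed after scaling to unit variance so that values are comparable.

```python
import numpy as np

def normalized_kurtosis(y):
    """Kurtosis E{y^4} - 3 (E{y^2})^2 of y after scaling to zero mean and unit variance."""
    y = (y - y.mean()) / y.std()
    return np.mean(y**4) - 3.0 * np.mean(y**2) ** 2

rng = np.random.default_rng(3)
a = rng.uniform(-1.0, 1.0, 200_000)    # sub-Gaussian variable
b = rng.uniform(-1.0, 1.0, 200_000)    # independent copy

k_single = normalized_kurtosis(a)
k_sum = normalized_kurtosis(a + b)
print(k_single, k_sum)
```

In theory the normalized kurtosis of a uniform variable is -1.2 and that of the sum of two independent uniform variables is -0.6, so the sum is closer to the Gaussian value of zero. The sample estimates printed above should be near these values.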

In BSS, if the kurtosis of the estimated signals is maximized in absolute value, and thus moved away from the kurtosis of a Gaussian signal (which is zero), then the reverse of the Central Limit Theorem argument applies and independence among the estimated signals is obtained. So we can define the fitness function of the GA as the sum of the absolute values of the kurtosis of the estimated signals:

$$\mathrm{Fitness} = \sum_{i=1}^{n} \left|\mathrm{kurt}(y_i)\right| \qquad (9)$$

where $y_i$ are the estimates of the source speech signals. The dependence among the estimated signals is minimized when Fitness is maximized, at which point the separation has been carried out. However, we recover the independent components under the premise that. In this algorithm, it is not necessary to assume that the sources have the same sign of kurtosis, because we directly maximize the absolute value in the fitness function. So we can separate super Gaussian signals from each other, sub Gaussian signals from each other, and super Gaussian signals from sub Gaussian signals.
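A short sketch of a fitness of this kind, the sum of absolute kurtosis values of the rows of $\mathbf{y} = W\mathbf{z}$, is given below. The function name `fitness`, the synthetic unit-variance sources and the 45-degree rotation are assumptions chosen to show that re-mixing independent sources lowers the fitness.

```python
import numpy as np

def fitness(W, z):
    """Sum of absolute kurtosis values of the rows of y = W z (z zero-mean)."""
    y = W @ z
    kurt = np.mean(y**4, axis=1) - 3.0 * np.mean(y**2, axis=1) ** 2
    return float(np.sum(np.abs(kurt)))

rng = np.random.default_rng(5)
T = 100_000
s = np.vstack([rng.laplace(size=T) / np.sqrt(2.0),                 # super Gaussian, unit variance
               rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), size=T)])  # sub Gaussian, unit variance

theta = np.pi / 4
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # orthogonal matrix that re-mixes the sources

f_separated = fitness(np.eye(2), s)   # candidate that leaves the sources separated
f_mixed = fitness(R, s)               # candidate that mixes them again
print(f_separated, f_mixed)
```

Because the absolute value is taken per signal, the super Gaussian and sub Gaussian sources both contribute positively, which is why no assumption on the sign of the kurtosis is needed.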

B. Orthogonalization

Orthogonalization plays such a central and practical role in ICA that the BSS algorithm would be defective without it. Estimating the coefficients by maximizing the fitness function alone is not enough to retrieve the independent components. With only the steps described so far, the outputs of the BSS algorithm are all similar speech signals: each is an estimate of the one source speech signal whose kurtosis is maximum. Note that the GA itself does its task correctly, because the fitness function is indeed maximized when all estimated signals are alike and have maximum kurtosis. We apply orthogonalization to avoid this problem. Orthogonalization yields an orthogonal separating matrix, which together with the whitening preprocessing gives Eq.5. When orthogonalization is applied in the GA before the fitness of each population is evaluated, the fitness function is maximized and, in addition, the estimated signals are mutually independent. There are two main methods of orthogonalization: deflationary and symmetric. Symmetric orthogonalization is usually used in ICA because of its wider applicability, and it is obtained through the following formula:

$$W \leftarrow (W W^T)^{-1/2}\, W \qquad (10)$$

Performing symmetric orthogonalization as the last necessary step ensures that the algorithm works in practice. The structure of the GA-based BSS algorithm is shown in Fig.2.
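Symmetric orthogonalization in its standard form, $W \leftarrow (WW^T)^{-1/2} W$, can be sketched with an eigenvalue decomposition of $WW^T$. The helper name `sym_orth` and the example matrix are illustrative; the sketch assumes $W$ has full rank so that all eigenvalues are positive.

```python
import numpy as np

def sym_orth(W):
    """Symmetric orthogonalization: W <- (W W^T)^{-1/2} W, for full-rank W."""
    d, E = np.linalg.eigh(W @ W.T)                      # W W^T = E diag(d) E^T
    inv_sqrt = E @ np.diag(1.0 / np.sqrt(d)) @ E.T      # (W W^T)^{-1/2}
    return inv_sqrt @ W

W = np.array([[2.0, 1.0],
              [1.0, 3.0]])        # arbitrary non-orthogonal candidate
W_orth = sym_orth(W)
print(np.round(W_orth @ W_orth.T, 10))   # identity matrix up to rounding
```

Unlike the deflationary approach, no row is privileged here: all rows of $W$ are adjusted at once, so estimation errors are not accumulated from one component to the next.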


To check the effectiveness of the proposed algorithm, we use the Euclidean distance between two vectors: the kurtosis of the estimated signals and the kurtosis of the source signals. The smaller this criterion, the better the result of the separating process. We also use the signal-to-noise ratio (SNR) to confirm the accuracy of the Euclidean distance as an evaluation criterion. We define SNR as:



Fig. 2 Structure of BSS based on GA. Flowchart blocks: produce initial population from the n^2 coefficients of the separating matrix; calculate estimated signals from Eq.2; calculate fitness of each population from Eq.9; elitism, produce new population and update it; symmetric orthogonalization; meet stopping criterion?; get the optimum (best) solution; obtain separated signals from Eq.2.
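The blocks of this flowchart can be put together in a compact continuous GA. The following is a sketch under stated assumptions, not the authors' implementation: the operator choices (blend crossover, Gaussian mutation, parents drawn from the better half), all parameter values, the fixed generation count as stopping criterion and the synthetic two-source data are hypothetical. Orthogonalization is done with an SVD, which for a full-rank $W$ gives the same result as $(WW^T)^{-1/2}W$.

```python
import numpy as np

def sym_orth(W):
    """Nearest orthogonal matrix U V^T; equals (W W^T)^{-1/2} W for full-rank W."""
    U, _, Vt = np.linalg.svd(W)
    return U @ Vt

def fitness(W, z):
    """Sum of absolute kurtosis values of the rows of y = W z."""
    y = W @ z
    return float(np.sum(np.abs(np.mean(y**4, axis=1) - 3.0 * np.mean(y**2, axis=1) ** 2)))

def cga_bss(z, pop_size=40, generations=60, p_cross=0.8, p_mut=0.1, seed=0):
    """Continuous GA over the n*n coefficients of W; z must be centered and whitened."""
    rng = np.random.default_rng(seed)
    n = z.shape[0]
    pop = rng.uniform(-1.0, 1.0, (pop_size, n, n))           # initial population
    best_W, best_f = None, -np.inf
    for _ in range(generations):                              # stopping criterion: fixed count
        pop = np.array([sym_orth(W) for W in pop])            # orthogonalize every candidate
        fit = np.array([fitness(W, z) for W in pop])
        order = np.argsort(fit)[::-1]                         # best candidate first
        pop, fit = pop[order], fit[order]
        if fit[0] > best_f:
            best_W, best_f = pop[0].copy(), float(fit[0])
        children = [pop[0].copy()]                            # elitism: keep the best
        while len(children) < pop_size:
            i, j = rng.integers(0, pop_size // 2, size=2)     # parents from the better half
            child = pop[i].copy()
            if rng.random() < p_cross:
                beta = rng.random()
                child = beta * pop[i] + (1.0 - beta) * pop[j] # blend crossover
            mask = rng.random((n, n)) < p_mut                 # per-gene mutation
            child = child + mask * rng.normal(scale=0.3, size=(n, n))
            children.append(child)
        pop = np.array(children)
    return best_W, best_f

# Synthetic demonstration: one super Gaussian and one sub Gaussian source.
rng = np.random.default_rng(4)
T = 20_000
s = np.vstack([rng.laplace(size=T) / np.sqrt(2.0),
               rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), size=T)])
A = np.array([[1.0, 0.6], [0.4, 1.0]])                        # illustrative mixing matrix
x = A @ s
x = x - x.mean(axis=1, keepdims=True)                         # centering
d, E = np.linalg.eigh(x @ x.T / T)
z = E @ np.diag(d ** -0.5) @ E.T @ x                          # whitening

W, f = cga_bss(z)
y = W @ z                                                     # separated signals
corr = np.abs(np.corrcoef(np.vstack([y, s]))[:2, 2:])         # |correlation| of y rows vs s rows
print("best fitness:", f)
print(np.round(corr, 3))
```

Each row of the printed correlation matrix should have one entry close to 1, which shows that every output matches one source up to sign and ordering, the usual ambiguities of BSS.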


In this experiment, speech signals are selected from the TIMIT database and are combined by an unknown mixing matrix whose values are drawn at random from a uniform distribution in the range. The population size in the continuous and binary genetic algorithms is, the crossover probability per chromosome is and the mutation probability per gene is. In the simulation with the binary genetic algorithm, each chromosome is encoded with eight-bit strings.
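For the binary variant, one common way to use eight-bit strings is to decode each group of eight bits into one real coefficient of the separating matrix. The sketch below assumes eight bits per coefficient and a coefficient range of [-1, 1]; both are assumptions for illustration, since the encoding details are not specified here.

```python
import numpy as np

def decode(bits, low=-1.0, high=1.0, n_bits=8):
    """Map a flat 0/1 sequence (n_bits per coefficient) to real values in [low, high]."""
    genes = np.asarray(bits, dtype=float).reshape(-1, n_bits)
    weights = 2.0 ** np.arange(n_bits - 1, -1, -1)    # most significant bit first
    levels = genes @ weights                          # integers 0 .. 2^n_bits - 1
    return low + (high - low) * levels / (2**n_bits - 1)

# Chromosome for a 2 x 2 separating matrix: four coefficients, eight bits each.
chromosome = ([0] * 8 + [1] * 8
              + [1, 0, 0, 0, 0, 0, 0, 0]
              + [0, 1, 1, 1, 1, 1, 1, 1])
W = decode(chromosome).reshape(2, 2)
print(W)
```

With this encoding each coefficient can take only 256 distinct values, with a spacing of 2/255 in the assumed range, whereas the continuous GA works on real-valued coefficients directly.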

A. Simulation 1

In this experiment, all of the sources are speech signals, which are super Gaussian. The sample length is 10000. We randomly choose the mixing matrix as:

Figure 3 shows 10000 samples of the source signals and Figure 4 shows the mixed signals. The signals separated by the proposed algorithm are shown in Fig. 5. Table 1 lists the kurtosis of the source, mixed and estimated signals, calculated by Eq.4.

The Euclidean distance (error) in this experiment shows that blind speech signal separation based on GA works with high accuracy in practice. The SNR of each of the three estimated signals is also given. The SNR values show that the obtained signals correspond closely to the sources.
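The two evaluation criteria can be sketched as follows. The Euclidean distance between kurtosis vectors follows the description in the text; the SNR expression used here, $10\log_{10}\left(\sum s^2 / \sum (s - y)^2\right)$ per source, is a common definition and an assumption, not necessarily the one used for the reported results. The sketch also assumes that the estimates have already been matched to the sources in order, sign and scale.

```python
import numpy as np

def kurtosis(y):
    """Row-wise kurtosis E{y^4} - 3 (E{y^2})^2 after removing the mean."""
    y = y - y.mean(axis=1, keepdims=True)
    return np.mean(y**4, axis=1) - 3.0 * np.mean(y**2, axis=1) ** 2

def kurtosis_distance(s, y):
    """Euclidean distance between the kurtosis vectors of sources and estimates."""
    return float(np.linalg.norm(kurtosis(s) - kurtosis(y)))

def snr_db(s, y):
    """Assumed SNR definition: 10 log10(sum s^2 / sum (s - y)^2) for each source."""
    return 10.0 * np.log10(np.sum(s**2, axis=1) / np.sum((s - y) ** 2, axis=1))

rng = np.random.default_rng(6)
s = rng.laplace(size=(2, 10_000))                 # stand-in sources
y = s + 0.01 * rng.normal(size=s.shape)           # estimates with a small residual error

print("kurtosis distance:", kurtosis_distance(s, y))
print("SNR (dB):", snr_db(s, y))
```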

Fig. 3 Source speech signals

Fig. 4 Mixed speech signals

Fig. 5 Separate speech signals

Kurtosis of source signals
Kurtosis of mixed signals
Kurtosis of estimated signals
Signal-to-Noise Ratio (SNR)
Euclidean distance = 0.0016

B. Simulation 2

In this simulation, we apply the proposed algorithm to sub Gaussian and super Gaussian signals under the same conditions. Figures 6-8 show the source signals, which are two speech signals and a cosine signal, the mixed signals where the mixture matrix is, and the separated signals. Table 2 shows that the proposed method successfully separates the speech signals and the noise from their linear mixtures. Thus, this method separates speech signals from white noise and babble noise with high performance and accuracy.

Fig. 6 Two super gaussian signals and a sub gaussian sinusoidal signal

Fig. 7 Mixed signals in Fig.6


Table 1 Comparison of original, mixed and separated signals, SNR values and separating error (Euclidean distance) for simulation 1

Fig. 8 Obtained speech signals and sinusoidal wave


Table 2 Comparison of original, mixed and separated signals, SNR values and separating error (Euclidean distance) for simulation 2

Kurtosis of source signals
Kurtosis of mixed signals
Kurtosis of estimated signals
Signal-to-Noise Ratio (SNR)
Euclidean distance = 0.0011353

C. Simulation 3

To provide an experimental demonstration of this algorithm, we mix speech signals with a random noise signal in interval and examine its efficiency. The simulation results are shown in Figs. 9-11. Table 3 shows the kurtosis of the source, mixed and estimated signals and the Euclidean distance, which measures the error. The experimental results in this case indicate the effectiveness of the technique in reducing noise such as white noise, factory noise and babble noise in speech signals.

Kurtosis of source signals
Kurtosis of mixed signals
Kurtosis of estimated signals
Signal-to-Noise Ratio (SNR)
Euclidean distance = 0.0015886


Table 3 Comparison of original, mixed and separated signals, SNR values and separating error (Euclidean distance) for simulation 3

This experiment was also repeated on several speech signals with and without noise, and the behavior of the algorithm was examined.

According to this experiment, BSS based on the continuous genetic algorithm can separate up to thirty speech signals successfully. Figures 12-13 show the error of applying CGA and BGA to speech signals with and without noise. It is obvious that separation based on the proposed method has fast convergence and high accuracy.

Fig. 9 Two speech signals and noise signals as sources

Fig. 10 Mixed speech signals with random noise

Fig. 11 Obtained speech signals and random noise

Fig. 12 The addition error diagram with increasing the number of speech sources in simulation of CGA and BGA without noise

Fig. 13 The addition error diagram with increasing the number of speech sources in simulation of CGA and BGA without noise

Figure 14 compares the best and the average fitness over 300 generations for blind source separation based on CGA and BGA. The fluctuation of BGA-based BSS means that this algorithm converges with distortion. The abscissa represents the 300 iterations and the ordinate represents the best and average fitness in each generation.

It is obvious that the CGA easily outperforms the BGA on this problem. BSS based on BGA fails to find the optimum with a population size of 80 in 300 generations. BSS based on CGA, on the other hand, easily finds the optimum within 300 generations and usually finds it within two hundred generations. Fig. 14 shows that the best solution and the average solution are very close together for BSS based on CGA. As a result, CGA correctly transforms each population into a better one by applying suitable genetic operators guided by the fitness function. The success of CGA relies on the definition of the fitness function using kurtosis.

Fig. 14 The best and average of fitness over 300 generations for BSS based on CGA and BGA


In this paper, we evaluated the performance of the genetic algorithm on the BSS problem for linear mixtures of independent sources. We also compared the operation of the binary and continuous genetic algorithms and concluded that the continuous genetic algorithm is particularly effective and converges without notable fluctuation when separating many sources from their mixtures. The proposed algorithm is based on high order statistics, using the kurtosis of the estimated signals in the fitness function of the genetic algorithm. The experimental results presented in this paper indicate the effectiveness of this method in reducing noise such as white noise, factory noise and babble noise in speech signals. They also show that no restriction on the distribution of the original signals is needed for the system to extract up to thirty sources from the observed signals. The proposed method overcomes the local minima problem that occurs in conventional gradient-based and neural network methods and can yield globally optimal solutions to linear BSS problems.