An electrocardiogram (ECG) signal represents the sum total of the depolarization potentials of millions of cardiac cells. The cardiac health of a subject can be assessed by inspecting its P-QRS-T wave. The Heart Rate Variability (HRV) data, extracted from the ECG signal, reflects the balance between the sympathetic and parasympathetic components of the autonomic nervous system. Hence, the HRV signal contains information on the imbalance between these two components that results in cardiac arrhythmias. In this paper, we have therefore analyzed HRV signal abnormalities to determine and classify arrhythmias. The HRV signals are non-stationary and non-linear in nature. In this work, we have used the Continuous Wavelet Transform (CWT) coupled with Principal Component Analysis (PCA) to extract the important features from the heart rate signals. These features are fed to a Probabilistic Neural Network (PNN) classifier for automated classification. Our proposed system demonstrates an average accuracy of 80%, with sensitivity and specificity of 82% and 85.6%, respectively, for arrhythmia detection and classification. Our system can be operated on larger data sets. Our CWT-PCA analysis results in eigenvalues which constitute the HRV signal analysis parameters. We have plotted the distributions of the mean values and standard deviations of these parameters for arrhythmia classification. We found some overlap in the distributions of these eigenvalue parameters for the different arrhythmia classes, which limits the effective use of these parameters to separate out the various arrhythmia classes. Hence, we have formulated an HRV Integrated Index (HRVID) of these eigenvalues, and determined and plotted the mean values and standard deviations of HRVID for the various arrhythmia classes. From this information, it is seen that this HRVID is able to distinctly separate out the various arrhythmia classes.
Hence, we have made a case for the employment of this HRVID as an index to effectively diagnose arrhythmia disorders.
Keywords: heart rate; heart rate variability signal; Principal Component Analysis (PCA); Probabilistic Neural Network; PCA eigenvalues; classification of arrhythmia; HRV integrated index for arrhythmia detection.
1. Introduction
The heartbeat originates from the Sino-Atrial (SA) node of the heart. The SA node is a group of specialized cells which continuously generate the electrical impulse. This impulse then spreads as a depolarization wave through the myocardium of the atria and ventricles, along specific pathways. This makes the atria and ventricles contract synchronously to take in blood and eject blood. Disturbance in this synchrony is manifested as arrhythmia, which can be detected by analysis of the Heart Rate Variability (HRV) signal.
The Autonomic Nervous System (ANS) involuntarily controls all internal organs of the body. Due to the continuous control of the ANS over the SA node output, the SA node generates impulses at about 50-80 beats per minute in healthy subjects at rest. The autonomic nervous system has two branches, namely the sympathetic and the parasympathetic (vagal) nervous systems. The sympathetic nervous system stimulates the functioning of organs, including the heart. Stimulation of the sympathetic nervous system causes an increase in HR and stroke volume along with systemic vasoconstriction.
The parasympathetic nervous system, in contrast, behaves in the opposite way and tends to inhibit the functioning of those organs. An increase in the stimulation of the parasympathetic system results in a decrease of HR and stroke volume along with systemic vasodilatation. Both the sympathetic and parasympathetic systems are active during rest. Perfect balance between them enables the heart to function optimally in response to internal and external stimuli.
The ECG signal constitutes the electrical activity of the heart, and originates at the SA node. By placing sensors on the limb extremities of the subject, the ECG signal can be extracted. The cardiac health of the subject can be diagnosed after observing the shape of the ECG signal 1. These ECG signals are non-stationary and non-linear in nature. Hence, it is somewhat difficult to visually observe the subtle changes in these signals.
The autonomic control of the cardiovascular system can be reflected by the heart rate variability 2, 3. The HRV is an indicator of the dynamic interaction and balance between the sympathetic nervous system and the parasympathetic nervous system. The HRV signal can be analyzed (as a noninvasive cardiac diagnostic procedure) by employing (i) time domain analysis and (ii) the frequency domain analysis.
Time domain methods of HRV are based on statistical measures and are simple to use. Many parameters can be calculated from the original heart rate signals: the standard deviation of the NN intervals (SDNN), the standard error of the NN intervals (SENN), the standard deviation of differences between adjacent NN intervals (SDSD), the root mean square of successive differences of intervals (RMSSD), and the percentage of successive interval differences that exceed 50 ms (pNN50). These statistical parameters can be used as time domain parameters 4.
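The time domain measures listed above can be illustrated with a minimal Python sketch. The function name `time_domain_hrv` and the choice of the sample (n-1) standard deviation are assumptions made for illustration; they are not taken from the cited works.

```python
import math
import statistics


def time_domain_hrv(nn_ms):
    """Return common time-domain HRV measures for NN intervals given in ms.

    Hypothetical helper: sample (n-1) standard deviations are an assumption.
    """
    if len(nn_ms) < 3:
        raise ValueError("need at least three NN intervals")
    # Differences between adjacent NN intervals.
    diffs = [b - a for a, b in zip(nn_ms, nn_ms[1:])]
    sdnn = statistics.stdev(nn_ms)
    return {
        "SDNN": sdnn,
        "SENN": sdnn / math.sqrt(len(nn_ms)),
        "SDSD": statistics.stdev(diffs),
        "RMSSD": math.sqrt(sum(d * d for d in diffs) / len(diffs)),
        # Percentage of successive differences larger than 50 ms.
        "pNN50": 100.0 * sum(abs(d) > 50 for d in diffs) / len(diffs),
    }
```

For example, `time_domain_hrv([800, 860, 790, 810, 900])` has successive differences of 60, -70, 20 and 90 ms, so three of the four differences exceed 50 ms.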
In the frequency domain, a typical power spectrum of the heart rate signal spans 0 to 0.5 Hz and is divided into the following main frequency bands 2, 4. In the Low Frequency (LF) band (0.02-0.05 Hz), variations are related to temperature regulation of the body, the vasomotor control and the renin-angiotensin system. A Very Low Frequency (VLF) band (0.01-0.04 Hz) is influenced by the sympathetic system. Mid Frequency (MF) band (0.05-0.14 Hz) variations are related to the arterial blood pressure control system, and are influenced by both the parasympathetic and sympathetic systems. The High Frequency (HF) band (0.14-0.5 Hz) is related to respiration, and varies with the respiratory rate. This band is mediated solely by the parasympathetic system.
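As a rough illustration of the frequency domain analysis, the sketch below estimates the power in the LF, MF and HF bands quoted above. It assumes NumPy is available, resamples the RR series onto an even 4 Hz grid, and uses a plain periodogram; the resampling rate, the estimator and the names `HRV_BANDS` and `band_powers` are all assumptions for illustration, not the procedure of the cited studies.

```python
import numpy as np

# Band limits in Hz as quoted in the text (the VLF band is left out here).
HRV_BANDS = {"LF": (0.02, 0.05), "MF": (0.05, 0.14), "HF": (0.14, 0.5)}


def band_powers(rr_s, bands=HRV_BANDS, fs=4.0):
    """Periodogram power of an RR-interval series (seconds) in each band."""
    rr_s = np.asarray(rr_s, dtype=float)
    beat_times = np.cumsum(rr_s)
    # Resample the unevenly spaced RR series onto an even grid (assumed 4 Hz).
    grid = np.arange(beat_times[0], beat_times[-1], 1.0 / fs)
    x = np.interp(grid, beat_times, rr_s)
    x = x - x.mean()
    spectrum = np.abs(np.fft.rfft(x)) ** 2 / (fs * len(x))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    df = freqs[1] - freqs[0]
    return {
        name: float(spectrum[(freqs >= lo) & (freqs < hi)].sum() * df)
        for name, (lo, hi) in bands.items()
    }
```

A series whose RR intervals are modulated at a respiratory rate of about 0.25 Hz should show most of its power in the HF band.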
In our previous work, instead of using linear methods, we employed nonlinear methods to unveil the hidden information in the signal, owing to its nonlinear nature 5, 6, 7. The heart rate signals were thus analyzed and classified into eight cardiac states, by using non-linear techniques and artificial intelligence, with an accuracy of 85% 8.
Bispectrum invariant features and phase entropies have been used by Chua et al. 7 to study cardiac arrhythmia, by using heart rate as the base signal. They have proposed different bispectrum and bicoherence plots, and classified normal and four other classes with an average accuracy of above 85%.
New dynamic methods of HRV quantification have been used to unearth nonlinear fluctuations in heart rate that are not otherwise obvious. These methods include: Lyapunov exponents 9, correlation dimension 10, 1/f slope 11, approximate entropy (ApEn) 12 and detrended fluctuation analysis 13.
The AutoRegressive (AR) modeling technique was used by Ge et al. 14 to classify Normal Sinus Rhythm (NSR) and other cardiac arrhythmias, namely Atrial Premature Contraction (APC), Premature Ventricular Contraction (PVC), Supraventricular Tachycardia (SVT), Ventricular Tachycardia (VT), and Ventricular Fibrillation (VF). They have successfully classified NSR, APC, PVC, SVT, VT, and VF with an accuracy of 93.2% to 100%, using the GLM (Generalized Linear Model) based classification algorithm.
We have used different linear and non-linear methods to analyze eight types of cardiac classes, namely: normal, premature ventricular contraction, atrial fibrillation, complete heart block, sick sinus syndrome, left bundle branch block, ventricular fibrillation, and ischemic/dilated cardiomyopathy 8. For this classification, specific ranges of values for different linear and non-linear parameters have been employed, with 'p' value less than 0.001 (statistically significant).
In this work, the Continuous Wavelet Transform (CWT) was first used to convert the heart rate signal from the time domain into the wavelet domain, since the CWT gives a time-frequency representation of the signal. It is important to choose the section of the heart rate signal in the time domain which stores most of the information of the signal, before extracting features from the signal. Then, Principal Component Analysis (PCA) has been performed on the CWT coefficients, in order to extract the eigenvalues.
In this paper, the features derived from PCA are used for the purpose of classification, using PNN classifier. The complete method used for the automatic identification of the cardiac diseases has been explained. The layout of this paper is as follows: The data acquisition process and preprocessing of the raw cardiac signals are presented in Section (2). Brief descriptions of CWT, PCA, PNN, and other statistical techniques used are given in Section (3). The results of the analysis are presented in Section (4). Section (5) contains the discussion of our data analysis, and the paper concludes in Section (6).
2. Data acquisition process
The ECG data necessary for this study was acquired from Kasturba Medical Hospital, Manipal, India. ECG recording of 15-min duration was stored in the Holter, when the patient was lying down comfortably. Prior permission from the staff of this hospital was sought before the ECG recording. The analogue data was digitized by using sampling frequency of 320 samples per second. The number of datasets, recorded in each class of patients, is shown below in Table 1.
Table 1. Number of heart rate data sets for different cardiac health states. In this table, we have NSR: Normal Sinus Rhythm, AF: Atrial Fibrillation, PVC: Premature Ventricular Contraction, CHB: Complete Heart Block, VF: Ventricular Fibrillation.
2.1. Cardiac Heart Rate Rhythm Classes
In this work, the cardiac data is classified into the following five classes.
Normal Sinus Rhythm (NSR)
Atrial Fibrillation (AF)
Premature Ventricular Contraction (PVC)
Complete Heart Block (CHB)
Ventricular Fibrillation (VF)
A brief description of the different cardiac classes is given below.
Normal Sinus Rhythm (NSR) is generated by the sinus node, and travels in a normal fashion in the heart. In the typical ECG signal, P waves are first observed. After a brief pause (of less than 0.2 seconds), a QRS complex is observed, and finally a T wave. The P wave morphology and axis must be normal. The PR interval ranges from 120 ms to 200 ms. The NSR is characterized by a usual rate of 60-100 bpm.
Atrial Fibrillation (AF) constitutes random activation of different parts of the atria at different times, due to multiple patterns of electrical impulses travelling randomly through the atria. In the ECG signal, AF is noted by absence of P waves and irregularity of R-R interval. These may be due to irregular conduction of impulses to the ventricles. The heart rate for patients with AF may range from 100 to 175 beats per minute.
In Premature Ventricular Contraction (PVC), the regularity of the underlying rhythm of the heart is interrupted, because a beat arises earlier than expected from a site outside the sino-atrial node. In the ECG signal, the QRS complex is not only widened, but it is also not associated with the preceding P wave. Usually, the T wave is observed to be in the opposite direction from the R wave. Two consecutive PVCs are termed a couplet, and a compensatory pause usually follows a PVC.
Complete Heart Block (CHB) is a disease of the heart's electrical system, which does not enable the electric signals of the heart to pass from the upper to the lower chambers. As a result, none of the impulses generated from the sinus node in the right atrium are conducted into the ventricles. This causes the ventricles to contract and pump the blood at a slower rate. Hence, there is a reduction of the heart rate (as low as 30 beats per minute). In patients with CHB, there is no normal relationship between the P and the QRS waves in the ECG.
Ventricular Fibrillation (VF) is a condition which causes the heart's electrical activity to become disordered, causing the heart to fibrillate very rapidly (sometimes at 350 beats per minute or more). VF commonly occurs in severe coronary artery disease and cardiac arrest patients. In VF, instead of pumping blood, ventricular muscles contract randomly, causing complete failure of ventricular function. It is observed that there is irregular chaotic electrical activity in the ECG. There is usually no recognizable QRS complex in patients with VF.
2.2. Preprocessing
The preprocessing of the ECG signals involves the following five steps.
Unwanted high frequencies are usually present in the ECG signal. A low pass filter with a cut-off frequency of 35 Hz is applied to the data in order to remove them.
In order to suppress the baseline wander that is present in the signal, a high pass filter with a cut-off frequency of 0.3 Hz is used.
To suppress the power-line interference noise, a band-stop filter centred at the power-line frequency (50 or 60 Hz) is used.
A median filter is used to extract the baseline wander still present in the processed ECG signal. In order to effectively remove this baseline wander, the extracted baseline is subtracted from the processed ECG signal.
The Tompkins algorithm 15, 16 is employed on the ECG data in order to detect the R peaks of the ECG.
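The median-filter step in the list above can be sketched as follows. This is a minimal illustration using only the Python standard library; the function name `remove_baseline` and the window length are assumptions, since the window actually used is not stated here.

```python
import statistics


def remove_baseline(signal, window):
    """Subtract a running-median baseline estimate from the signal.

    `window` is the (odd) number of samples in the median window; its value
    is a free parameter in this sketch.
    """
    if window % 2 == 0:
        raise ValueError("window must be odd")
    half = window // 2
    corrected = []
    for i, value in enumerate(signal):
        # The window is truncated at the two ends of the record.
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        baseline = statistics.median(signal[lo:hi])
        corrected.append(value - baseline)
    return corrected
```

With a constant offset of 5 and a single spike, `remove_baseline([5, 5, 5, 9, 5, 5, 5], 3)` removes the offset while keeping the spike.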
Between two successive QRS complexes, there exists an interval which is defined as the RR interval ($t_{r\text{-}r}$ seconds). The heart rate (in beats per minute) is given by the following equation:
$$HR = \frac{60}{t_{r\text{-}r}} \qquad (1)$$
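Equation (1) can be applied directly to the detected R peaks. The sketch below assumes the R peaks are given as sample indices and uses the 320 samples per second sampling frequency stated in Section 2; the function name is hypothetical.

```python
def heart_rate_from_r_peaks(r_peak_samples, fs=320.0):
    """Instantaneous heart rate (bpm) from R-peak sample indices, per Eq. (1)."""
    rates = []
    for prev, cur in zip(r_peak_samples, r_peak_samples[1:]):
        t_rr = (cur - prev) / fs  # RR interval in seconds
        if t_rr <= 0:
            raise ValueError("R-peak indices must be strictly increasing")
        rates.append(60.0 / t_rr)
    return rates
```

For instance, R peaks at samples 0, 320 and 560 give RR intervals of 1.0 s and 0.75 s, that is, heart rates of 60 and 80 beats per minute.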
3. HRV signal analysis methods
The HRV signals are analyzed in the wavelet domain. In this project, CWT has been used to transform the signal into wavelet domain. Principal component analysis (PCA) is performed on the extracted CWT coefficients, in order to obtain the first three eigenvalues. The first three eigenvalues of PCA are then used as input to the PNN classifier. Fig. 1 shows the block diagram of the proposed system.
Fig. 1. Proposed system (heart rate signal, CWT, PCA, PNN classifier).
3.1. Continuous Wavelet Transform (CWT)
Fourier transform techniques are not suitable for analysis of non-stationary bio-signals. The Fourier transform uses complex exponential functions of infinite duration to represent those bio-signals of finite interval. Hence, it is not suitable for our analysis. Wavelet analysis, on the other hand, provides a better insight into both the timing and intensity of transient events.
There are two types of wavelet analysis: the CWT and the Discrete Wavelet Transform (DWT) 17. The DWT is a sampled version of the CWT on a dyadic grid. Therefore, the wavelet coefficients are calculated for discrete values of the translation and scale factors, with increments in the dyadic scale 18.
A 'wavelet' is a small wave with finite energy and finite duration that is correlated with the signal to obtain the wavelet coefficients 19, 20. The reference wavelet is known as the mother wavelet, and in this work, we have used the Morlet (morl) wavelet (Fig. 2).
Fig.2. Morlet Wavelet (morl).
The Continuous Wavelet Transform of a signal f(t) is given by
$$W(s,\tau) = \int_{-\infty}^{\infty} f(t)\,\psi^{*}_{s,\tau}(t)\,dt \qquad (2)$$
The scaled and translated wavelet is obtained from the mother wavelet $\psi(t)$ as
$$\psi_{s,\tau}(t) = \frac{1}{\sqrt{s}}\,\psi\!\left(\frac{t-\tau}{s}\right) \qquad (3)$$
where $\tau$ and $s$ are called the translation and scale parameters. The transformation is obtained from the mother wavelet through scaling and translation. The scale parameter corresponds to the frequency information. Scaling either dilates or compresses the wavelet. A large scale (low frequency) dilates the wavelet and provides global, coarse information about the signal. A small scale (high frequency) compresses the wavelet and reveals the detailed information hidden in the signal. The translation parameter refers to the location of the wavelet function as it is shifted through the analyzed signal.
Mathematically, the continuous wavelet transform is defined as the sum over all time of the signal f(t) multiplied by shifted and scaled versions of the analyzing wavelet. The end result is a set of wavelet coefficients which are a function of these shift and scale parameters. A plot of the magnitudes of these coefficients over translation and scale is known as a scalogram.
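The definition above can be made concrete with a direct numerical sketch. It assumes NumPy, takes the real Morlet wavelet as exp(-t²/2)·cos(5t) (the form commonly labelled "morl"), and approximates the integral by a sum over the samples. The function names and the scale values are illustrative assumptions, and this direct O(n²) evaluation is only practical for short records.

```python
import numpy as np


def morlet(t):
    """Real Morlet mother wavelet, assumed here to be exp(-t^2/2) * cos(5t)."""
    return np.exp(-t ** 2 / 2.0) * np.cos(5.0 * t)


def cwt_morlet(signal, scales, dt=1.0):
    """Direct CWT: one row of coefficients per scale, one column per translation."""
    signal = np.asarray(signal, dtype=float)
    n = len(signal)
    times = np.arange(n) * dt
    coeffs = np.empty((len(scales), n))
    for i, s in enumerate(scales):
        # kernel[k, j] = psi((t_j - tau_k) / s) / sqrt(s)
        kernel = morlet((times[None, :] - times[:, None]) / s) / np.sqrt(s)
        # Riemann-sum approximation of the integral over t for every tau_k.
        coeffs[i] = kernel @ signal * dt
    return coeffs
```

Because this wavelet oscillates at 5 rad per unit time, a sinusoid of frequency f (in cycles per sample) responds most strongly near the scale s = 5 / (2πf); a 0.1 cycles-per-sample sinusoid therefore peaks near s = 8.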
3.2. Principal Component Analysis (PCA)
Principal Component Analysis (PCA) is a technique that can be used to reduce multi-dimensional data to fewer dimensions without much loss of information. It is one of the most successful techniques that have been used in data recognition and compression. PCA reduces the dimension of the data by performing a covariance analysis, which identifies patterns in the data and expresses them in terms of their similarities and differences. This describes the strong correlation between the observed variables in a multi-dimensional data set. Covariance, as given in Eqn. (4) below, is used to measure how much the dimensions vary from their means with respect to each other.
$$\operatorname{cov}(X,Y) = \frac{\sum_{i=1}^{n}\left(X_i-\bar{X}\right)\left(Y_i-\bar{Y}\right)}{n-1} \qquad (4)$$
where $\bar{X}$ and $\bar{Y}$ refer to the means of set X and set Y, respectively, while the total number of elements is represented by n.
PCA first subtracts the mean value of each dimension from the data in that dimension. Hence, all the x values have $\bar{x}$ subtracted from them and all the y values have $\bar{y}$ subtracted from them. The result is a data set with zero mean, for which the covariance matrix is calculated. From the covariance matrix, the eigenvectors and eigenvalues, which provide the information about the patterns in the data, are calculated. An eigenvalue is a scalar associated with a square matrix, and an eigenvector is required to be a non-zero vector; the two always come as a pair. Given a complex square matrix A and a non-zero complex column vector X, $\lambda$ is a complex number that satisfies $AX = \lambda X$. X is known as an eigenvector and $\lambda$ as the corresponding eigenvalue of matrix A. The eigenvectors of the covariance matrix are perpendicular to each other. This allows the data to be expressed in terms of these perpendicular eigenvectors instead of the x and y axes, which do not show exactly how the points in the data are related to each other. Therefore, we are able to extract the lines which characterize the data by calculating the eigenvectors of the covariance matrix.
The principal components of the data set are the eigenvectors with the highest eigenvalues. They show the most significant relationships between the data dimensions. Once the eigenvectors have been calculated from the covariance matrix, they are arranged in order of eigenvalue, from highest to lowest. Herein, the eigenvalues obtained after performing PCA on the CWT coefficients were used to provide the patterns in the data set.
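The PCA steps described above (mean removal, covariance matrix, eigenvalues sorted from highest to lowest) can be sketched as follows. The sketch assumes NumPy and treats each row of the input as one observation; how the CWT coefficient matrix was actually arranged before PCA is not specified here, so that layout and the function name `pca_eigenvalues` are assumptions.

```python
import numpy as np


def pca_eigenvalues(data, k=3):
    """Largest k eigenvalues of the covariance matrix of `data`.

    Rows are taken as observations and columns as dimensions (an assumption).
    """
    data = np.asarray(data, dtype=float)
    # Subtract the mean of each dimension so the data set has zero mean.
    centred = data - data.mean(axis=0)
    # Sample covariance matrix, consistent with Eqn. (4).
    cov = centred.T @ centred / (data.shape[0] - 1)
    # eigvalsh returns the eigenvalues of a symmetric matrix in ascending order.
    eigvals = np.linalg.eigvalsh(cov)
    return eigvals[::-1][:k]
```

Calling `pca_eigenvalues(coefficients, k=3)` on a coefficient matrix would give the first three eigenvalues used as classifier inputs in this work.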
3.3. Quantitative analysis
In this work, all the features described above are subjected to Analysis Of Variance (ANOVA), in order to determine whether their means are different. ANOVA, which is based on the F-test and is closely related to the t-test, is a statistical method which uses variances to determine whether the means are different. The main difference between ANOVA and the t-test is that ANOVA measures the differences between the means of two or more groups, while the t-test measures the differences between the means of two groups. If the observed differences between the group means are large relative to the variation within the groups, the result is considered to be statistically significant. The p-value (in Table 2) was obtained by using the ANOVA test.
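The F statistic behind a one-way ANOVA can be computed in a few lines. The sketch below uses only the Python standard library and a hypothetical function name; it returns the F ratio only, since obtaining the p-value additionally requires the F distribution (available, for example, through `scipy.stats.f_oneway`).

```python
def one_way_anova_f(groups):
    """F statistic of a one-way ANOVA over a list of groups of values."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the values around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))
```

For the toy groups `[[1, 2, 3], [2, 3, 4], [5, 6, 7]]`, the between-group sum of squares is 26 and the within-group sum of squares is 6, giving F = (26/2)/(6/6) = 13.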
3.4. Probabilistic Neural Network (PNN)
A PNN is a special type of neural network that learns to approximate the probability density function (pdf) of the training data. It is a kind of two-layer radial basis network suitable for classification problems. When an input is presented, the first layer (radial basis layer) computes distances from the input vector to the training input vectors and produces a distance vector whose elements indicate how close the input is to a training input. The second layer (competitive layer) sums these contributions for each class of inputs to produce a vector of probabilities as its net output. Then the compete transfer function on the output of the second layer picks the maximum of these probabilities, and assigns a class label 1 for that class and a 0 for the other classes.
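The two layers described above can be sketched in plain Python. This is a minimal illustration, not the classifier configuration used in this work: the Gaussian kernel, the smoothing parameter `sigma`, and the function name `pnn_classify` are assumptions.

```python
import math


def pnn_classify(x, train_x, train_y, sigma=1.0):
    """Classify `x` with a minimal probabilistic neural network.

    Returns the winning class label and the per-class probabilities.
    """
    scores = {}
    for xi, yi in zip(train_x, train_y):
        # Radial basis layer: Gaussian response to the squared distance.
        d2 = sum((a - b) ** 2 for a, b in zip(x, xi))
        # Second layer: sum the contributions of each class.
        scores[yi] = scores.get(yi, 0.0) + math.exp(-d2 / (2.0 * sigma ** 2))
    total = sum(scores.values())
    if total > 0:
        probs = {c: s / total for c, s in scores.items()}
    else:
        # All kernel responses underflowed; fall back to equal probabilities.
        probs = {c: 1.0 / len(scores) for c in scores}
    # Compete step: the class with the highest probability wins.
    return max(probs, key=probs.get), probs
```

In the setting of this paper, each training vector would hold the three PCA eigenvalues of one heart rate record and each label would be one of the five rhythm classes.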
4. Results
4.1. Eigenvalues for Normal Sinus Rhythm and Arrhythmias
Table 2 shows the range of the three eigenvalues (λ1, λ2, λ3) for the five classes. The distribution of the three eigenvalues for the five classes is shown in Figs. 3(a) - (c). The result of ANOVA on the eigenvalues obtained from PCA, for the normal cardiac condition and the various cardiac diseases, is listed in Table 2. These values are statistically significant because the 'p' values are very low (<0.0001).
The heart rate varies continuously between 60 bpm and 80 bpm for normal sinus rhythm (NSR). Since there is higher variation in the heart rate, the eigenvalues (λ1, λ2, λ3) appear to be high in Table 2. For NSR, the mean values of the eigenvalues (λ1, λ2, λ3) are -1090.8, -53.530, and -40.584 respectively. There may be a possibility that these values are related to the rate of breathing and its harmonics, as we have to take into consideration the modulating effect on the heart rate variability due to the breathing pattern. The heart has to work harder in order to meet higher body demands. Hence, the HRV will be high.
Table 2. Results of ANOVA for the three eigenvalues of the five classes of heart rate rhythms (for normal sinus rhythm and four classes of arrhythmias); the entries in the columns (other than in the last column) correspond to mean ± standard deviation.
For CHB, the heart rate variation is low, due to the inability of the Atrio-Ventricular (AV) node to send electrical signals rhythmically to the ventricles. As compared to NSR, there is a reduced beat-to-beat variation for CHB, as indicated in Table 2 by the mean values of the eigenvalues (λ1, λ2, λ3). The mean values of the eigenvalues (λ1, λ2, λ3) are -457.00, -25.088, and -11.593, respectively.
In the case of ventricular fibrillation (VF), the heart fibrillates very rapidly, causing the heart rate variation to be high. As compared with NSR, the mean values of the eigenvalues (λ1, λ2, λ3) are higher for VF, as indicated in Table 2. The mean values of the eigenvalues (λ1, λ2, λ3) are -1644.8, -92.155, and -53.558 respectively.
For atrial fibrillation (AF), there is random activation of different parts of the atria at different times. As compared to NSR, the mean values of the eigenvalues (λ1, λ2, λ3) are higher for AF, but lower than VF, as indicated in Table 2. The mean values of the eigenvalues (λ1, λ2, λ3) are -959.15, -42.151, and -28.939, respectively.
In the case of premature ventricular contraction (PVC), there is an ectopic beat beginning from one of the ventricles. The eigenvalues (λ1, λ2, λ3) are higher and their mean values are -1061.2, -59.39, and -6.93 respectively. Fig. 3 indicates the distribution of the (λ1, λ2, λ3) features.
Fig. 3. Distributions of eigenvalues extracted from the heart rate signals: (a) Eigenvalue 1 (λ1), (b) Eigenvalue 2 (λ2), (c) Eigenvalue 3 (λ3).
4.2. Classification results
The number of samples used for training and testing is presented in Table 3. Table 4 shows the results of the classification efficiency of the PNN classifier obtained using the 10-fold cross validation method. The results indicate that our PCA method can be used for the detection of the unknown cardiac class with an average accuracy of about 80%, specificity of 85.6%, and sensitivity of 82%.
Table 3. Number of training and testing samples used in each class.
Table 4. Average values of the number of True Negatives (TN), False Negatives (FN), True Positives (TP), False Positives (FP), accuracy, Positive Predictive Value (PPV), sensitivity, and specificity over the ten folds for the PNN classifier.
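The performance measures named in the caption of Table 4 follow from the four confusion counts by their standard definitions. The sketch below is illustrative only; the function name is hypothetical and the counts in the usage note are made-up numbers, not results from this study.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity, specificity and PPV from confusion counts."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
    }
```

With illustrative counts of TP = 40, TN = 45, FP = 5 and FN = 10, the accuracy is 85/100, the sensitivity 40/50 and the specificity 45/50.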
4.3. Integrated Index for Heart Rate Variability
Although the eigenvalues yield a high degree of efficiency, sensitivity and specificity, it can be seen (from Fig. 3) that the distributions of these eigenvalue parameters exhibit considerable overlap for some arrhythmias. Hence, we cannot employ these eigenvalue parameters to specifically distinguish the four arrhythmias from one another and from the normal sinus rhythm. So now, based on the concept of a single Physiological Index number 21, 22, we propose a new integrated index, called the HRVID Index, using λ1, λ2, and λ3, as given by the below equation.
Table 5 shows the range of HRVID index values for different cardiac states. It is seen that this HRVID Index can effectively separate out the different arrhythmia classes from the normal sinus rhythm (NSR) category. Also, we can clearly distinguish VF and CHB from each other as well as from AF and PVC. Then, the eigenvalue can be used to further separate PVC and AF.
Table 5. Range of HRVID values for NSR and four arrhythmia states: AF, CHB, PVC, VF.
We now plot the distributions of this HRVID Index for the normal and arrhythmia classes, in Fig. 4. It is noted from the figure that there is effective separation of AF, PVC, CHB, and VF from NSR. Also, there is effective separation of VF and CHB from each other as well as from AF and PVC. However, there is no significant separation between AF and PVC. So we are recommending that this HRVID Index be employed to first decide the presence of NSR or VF or CHB. However, if, based on the HRVID index, there is evidence of the presence of AF or PVC, then we should employ the value of the eigenvalue to decide between AF and PVC. This procedure can be implemented clinically, by means of a Decision Tree (or a Neural Network) incorporated into the Electronic Medical Records.
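The two-stage procedure recommended above can be sketched as a small decision rule. Everything numeric in this sketch is a placeholder: the index ranges, the eigenvalue threshold, and the direction of the AF/PVC comparison are hypothetical inputs supplied by the caller, and would have to be taken from the ranges reported for each class.

```python
def classify_rhythm(hrvid, eigenvalue, index_ranges, af_pvc_threshold):
    """Two-stage rule: HRVID range first, then an eigenvalue to split AF from PVC.

    `index_ranges` maps a label to a (low, high) HRVID interval. The label
    "AF/PVC" marks the interval in which the index cannot separate the two.
    All ranges and the threshold are hypothetical, caller-supplied values.
    """
    for label, (low, high) in index_ranges.items():
        if low <= hrvid < high:
            if label == "AF/PVC":
                # Assumed direction: smaller eigenvalue maps to AF.
                return "AF" if eigenvalue < af_pvc_threshold else "PVC"
            return label
    return "unclassified"


# Placeholder ranges for illustration only; they are NOT values from this study.
EXAMPLE_RANGES = {"CHB": (0, 1), "NSR": (1, 2), "AF/PVC": (2, 3), "VF": (3, 4)}
```

Keeping the ranges outside the function means the same rule can be reused once clinically validated ranges are available.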
Fig. 4. Variation of HRVID index values for different cardiac states.
5. Discussion
In order to classify the cardiac arrhythmia categories using the heart rate variability signals, different non-linear methods have been applied over the years 23-25. In these works, different non-linear parameters such as Lyapunov exponent, correlation dimension, fractal dimension, approximate entropy, detrended fluctuation analysis, and Hurst exponent were used to identify the unknown arrhythmia class using the heart rate variability signal. Table 6 shows the comparison of arrhythmia classification work carried out, and the accuracy of classification.
Table 6. Comparison of studies that conducted arrhythmia classification with non-linear features.
Study | No. of classes | Accuracy
Acharya et al. 5 | 4 | 95%
Acharya et al. 8 | 8 | 85.36%
Kannathal et al. 25 | 10 | 94%
Acharya et al. 26 | 9 | 83.83%
Chua et al. 7 | 5 | 85%
Ge et al. 14 | 6 | 93.2% to 100%
Non-linear parameters, such as spectral entropy, Poincare plot geometry and largest Lyapunov exponent, were fed into ANN and Fuzzy classifiers for automatic classification. By means of these parameters, an average accuracy of 95% for four classes 5, and 85.36% accuracy for eight classes 8, were obtained.
The features SD1/SD2 (short range variability/long range variability), largest Lyapunov exponent and Hurst exponent, coupled with an adaptive neuro-fuzzy inference system (ANFIS) classifier, were able to classify ten cardiac classes with an accuracy of 94% 25. Nine cardiac classes were automatically identified using the first three peak amplitudes and corresponding frequencies from Fast Fourier transform (FFT), Auto regressive (AR), Auto regressive moving average (ARMA) and Moving average (MA) modeling techniques 26. Their results show that the ARMA modeling technique performed better than the other methods, with an accuracy of 83.83%.
Using AR modeling coefficients and the GLM (generalized linear model) classification algorithm, Ge et al. 14 have categorized six cardiac classes with an accuracy of 93.2% to 100%. Recently, Chua et al. 7 have classified heart rate into five classes, by using HOS features and a support vector machine (SVM) classifier, with an accuracy of 85%.
In our present work, we have used eigenvalues of PCA and a PNN classifier to catalog five classes with an accuracy of 80%. This classification accuracy can be further increased with better features, diverse data and more robust training. Then, using the newly developed and trademarked concept of a single physiological index number 21, 22, we have further formulated a novel integrated index, which combines the three eigenvalues into a single HRVID Index. This index is so formulated that it enables effective separation of (i) the four arrhythmia classes from the normal sinus rhythm category, as well as (ii) the VF and CHB arrhythmias. However, detection of AF and PVC is to be made on the basis of the value of the eigenvalue.
Using nonlinear methods like correlation dimension 27, fractal dimension 28, detrended fluctuation analysis 29, higher order spectra 30, higher order cumulants 31 and recurrence quantification analysis 32, the classification efficiency can be further increased. These heart rate signals and patient information can be interleaved within images with different error correcting codes in a noisy environment without affecting the hidden information 33, 34.
6. Conclusion
Heart rate can be used as a reliable indicator of cardiac diseases. In this work, we have extracted three eigenvalues (λ1, λ2, λ3), after performing PCA, from five different types of heart rate signals for automated classification using PNN. Our proposed system is able to identify the unknown cardiac rhythm class with an accuracy of 80%, and sensitivity and specificity of 82% and 85.6% respectively. The accuracy of our system can be further increased with larger and more diverse training data, better features, and more rigorous training.
We have also proposed a new HRVID Index, to more accurately identify the different cardiac arrhythmia states from the normal sinus rhythm state. The different ranges of this Index are indicated, to effectively categorize cardiac normal rhythm and cardiac arrhythmia states. As indicated, we are recommending that this novel index be employed to first determine the presence of NSR, VF, CHB, or AF/PVC. Then, detection of AF and PVC can be made on the basis of the value of the eigenvalue. For clinical application, a Decision Tree can be used to implement this procedure.
Conflict of interest statement