Detection of cardio-respiratory disorders is important for the prognosis of several medical conditions. Present techniques involve complex diagnostic procedures and require automatic classification of the acquired signals. This paper presents a novel approach for detecting cardio-respiratory disorders from photoplethysmograph (PPG) signals acquired with a simple medical setup, which reduces cost and discomfort to the subjects. Signals are classified using several time and frequency features extracted from the PPG signals. Decision tree based classification has been implemented, and accuracies of 94.44% and 97.19% have been achieved for the induced cardiac stress and apnea conditions respectively. Based on the results, a preliminary diagnosis can be performed to detect the cause of abnormality in the recordings.
Photoplethysmography (PPG) is a non-invasive technique for measuring blood volume changes in the micro-vascular bed of tissue. PPG has been used extensively in clinical settings for monitoring blood oxygen saturation, heart rate, cardiac output, blood pressure and respiration. The measurement is performed by projecting visible or infra-red light onto the surface of the skin and detecting the light transmitted through or reflected from the blood vessels. The fluctuations in signal intensity may be periodic or non-periodic, arising from the combined influence of perfusion pressure and sympathetic vascular control. PPG signals can be acquired at various sites of the body such as the fingers, earlobes, forehead or toes, allowing different data acquisition protocols.
The PPG consists of two components: a slowly varying DC offset due to skin and electrode response, and an AC component, typically around 1 Hz, which reflects blood volume pulsations. The amplitude fluctuations in the PPG signal are influenced by respiration and by the activity of the sympathetic nervous system which, in turn, are attributed to autonomic control of the peripheral vessels. The forward pressure wave is created by ventricular contraction [6-8]. It flows from the heart to the aorta and other smaller arteries, and is called the rising phase or systolic component. These waves are then reflected from the periphery at the main branch points and constitute the diastolic component. The point of reflection of the wave is characterized by the dicrotic notch, whose height is considered to be a measure of peripheral pressure wave reflection [9-10]. Pulse transit time (PTT) is the time interval between the systolic and diastolic peaks. Qualitatively, the systolic peak (anacrotic phase) corresponds to the heart condition, and the diastolic peak (catacrotic phase) is used to determine elasticity and other features of the vascular system. Fig. 1.1 shows the morphology of a typical PPG waveform.
Figure 1.1: Morphology of typical PPG waveform
The shape of the PPG signal encodes information about the cardiovascular and respiratory state of the subject, and a detailed shape analysis can provide clinical data for early detection of cardiovascular and respiratory abnormalities. The potential for extracting diagnostic information from the PPG has been reviewed. Ageing and arterial diseases are said to affect the variations of the AC component in the PPG waveform. The peripheral pulse is used to assess the state of health and disease in subjects. A weak or delayed response indicates signs of occlusive arterial disease. During apnea, vasoconstriction occurs and is reflected in the PPG signal by a decrease in the fluctuation of amplitude. Several such quantitative measures that define the pulse shape have been used in the analysis of PPG waveforms. Time domain features such as PPG rise time, peak-to-peak time, peak amplitudes, shape and variability have been investigated. Pulse Rate Variability (PRV) exhibits frequency components from 0 to 0.5 Hz which are associated with the branches of the autonomic nervous system. Frequency components in the 0.15-0.4 Hz range represent vagal tone and are known as High Frequency (HF) components. Frequencies from 0.04 to 0.15 Hz manifest the activation of parasympathetic and sympathetic nerves and are labeled Low Frequency (LF) components. The ratio between LF and HF is defined as the sympatho-vagal balance and has been termed the Power Ratio (PR). In this work, detailed spectral and time domain analysis has been carried out for classification of PPG.
A decision tree algorithm is a data mining technique that recursively partitions a data set using different methods until all the data items are classified. This phase is known as the tree building phase and is performed in a top-down manner. The second phase of classification is the tree-pruning phase, which is performed in a bottom-up manner and is used to improve the classification accuracy of the algorithm. Using this technique, abnormalities associated with the respiratory and cardiac systems can be detected with a good degree of accuracy.
MATERIALS AND METHODS
Data acquisition was performed on 45 healthy, non-smoking and non-athletic volunteers (25 male and 20 female subjects) without symptoms of cardiac or respiratory disease. The subjects were seated during the examination with their hands laid comfortably on the thighs and were encouraged to keep their fingers still to prevent motion artifacts. The first phase of acquisition was carried out for all subjects in the resting state. The second phase was carried out under different experimental conditions. For the respiratory and cardiac case studies, 25 subjects were asked to perform a breath-hold exercise and the remaining 20 subjects were asked to undergo physical exercise for 5 minutes prior to data acquisition. The subjects' physical attributes such as height and weight were also recorded for post-acquisition analysis. Studies were performed with the approval of the Centre for Biomedical Research and Signal Processing, SSN College of Engineering, Chennai, and consent was obtained from all volunteers before data were acquired.
The PPG signal was acquired using a BIOKIT Physiograph (Version 4.1 Build 3), TekSys Electronics. An optical transmit-receive finger sensor (LED and LDR) with a wavelength of 940 nm, an input impedance of 1 MΩ and a maximum gain of -5K was used to acquire the data from the subjects. The frequency response was recorded at 2-40 Hz. The casing held a PCB-mounted transmitter and receiver in a Velcro belt. The most suitable location for the PPG sensor was found to be the index finger because of the high signal strength and the comfort of the subjects. The PPG signal was acquired from the right index finger. Subjects were made to sit in an upright position with the forearm placed in a relaxed position on the thigh. Care was taken to reduce motion artifact due to respiration. After a short resting period for stabilization, the PPG data were acquired post-prandially (about 30 minutes after food). A complete schema of the data acquisition procedure is presented in Fig. 2.1. During the procedure, the subjects breathed spontaneously at more than 12 cycles/min and the signals were recorded at a sampling frequency of 1000 Hz. Room temperature was regulated at 28 °C with humidity at 50%.
Figure 2.1: Timing diagram for data acquisition
Feature extraction is essential for detecting abnormal patterns in the recorded bio-signals. Although several features can be extracted from the time and frequency domains, accurate classification depends on identifying the key features that provide concrete evidence of variation from normal. Combining features from both domains makes the classification system more accurate and robust.
2.3.1 Pre-processing: PPG signals have a less complex morphology than other key physiological signals, so feature extraction and peak detection are relatively simple. However, baseline drift and distortion may occur more frequently due to movement of the subjects or their physiological condition. It has also been demonstrated that fluctuations caused by respiration, sympathetic activity and even arousal changes such as drowsiness may cause baseline drift. The major interferences affecting PPG signals are motion artifacts, respiration and low perfusion.
These interferences highlight the need to pre-process PPG signals before applying feature extraction algorithms. Baseline wander can be removed using linear de-trending. Noise due to external factors and electrical interference is removed by applying a digital FIR band-pass filter of order 8 with a pass band of 0.01-40 Hz. The recorded data are then segmented into 15-second intervals prior to feature extraction. All signal processing stages have been implemented in MATLAB version 7.0 (The MathWorks, Inc.).
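As an illustration of two of the pre-processing stages described above, the following minimal Python sketch performs least-squares linear de-trending and splits a recording into 15-second segments at the 1000 Hz sampling rate. It is not the MATLAB implementation used in this work: the function names `linear_detrend` and `segment` are hypothetical, the band-pass filtering stage is deliberately omitted, and dropping an incomplete final segment is an assumption.

```python
def linear_detrend(samples):
    """Subtract the least-squares straight line from a list of samples."""
    n = len(samples)
    if n < 2:
        return [0.0] * n
    mean_x = (n - 1) / 2.0
    mean_y = sum(samples) / n
    sxx = sum((i - mean_x) ** 2 for i in range(n))
    sxy = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(samples))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (slope * i + intercept) for i, y in enumerate(samples)]


def segment(samples, fs_hz=1000, window_s=15):
    """Split samples into non-overlapping windows of window_s seconds.

    Assumption: an incomplete window at the end of the recording is dropped.
    """
    size = int(round(fs_hz * window_s))
    return [samples[i:i + size]
            for i in range(0, len(samples) - size + 1, size)]


if __name__ == "__main__":
    # Hypothetical 35 s recording consisting only of a linear drift.
    recording = [0.001 * i + 2.0 for i in range(35000)]
    flat = linear_detrend(recording)
    windows = segment(flat)
    print(len(windows), "segments of", len(windows[0]), "samples")
```

Here the whole recording is de-trended before segmentation; de-trending each 15-second segment separately would be an equally plausible reading of the procedure.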
2.3.2 Time domain: The morphology of the pulsatile component of the PPG signal is said to change with physiology. The analysis of signal morphology is significant because it is believed to contain information on the cardiovascular system and gives vital evidence for identifying clinical conditions such as diabetes, atherosclerosis and arterial stiffness. Several time domain features have been extracted for the analysis of the PPG signal. They can be classified as time and amplitude indices. The typical time domain indices of the PPG signal are shown in Fig. 2.2.
Figure 2.2: PPG waveform illustrating time domain indices
Stiffness Index (SI) is a measure of arterial distensibility and is used to identify age-related conditions such as arterial stiffness. It is also considered a surrogate for pulse wave velocity (PWV) measurement, which is used as a marker of vascular damage. SI is calculated from the subject's height and the PTT. The equation is as follows:

$$ SI = \frac{h}{PTT} $$

where h is the subject's height.
PTT is the time interval between the systolic and diastolic peaks of the PPG signal. It reflects the transit time of pressure waves from the root of the subclavian artery to the point of reflection and back. SI tends to increase with age because PTT shortens as the large arteries stiffen.
The Reflection Index (RI) is mainly used to characterize changes due to apnea, in which the amplitude difference between the systolic and diastolic peaks is suppressed. It is also considered a non-invasive marker for vascular assessment. RI is the ratio of the diastolic peak amplitude to the systolic peak amplitude:

$$ RI = \frac{b}{a} $$

where a and b are the systolic and diastolic peak amplitudes respectively. RI can be used as a diagnostic tool for vascular age and arterial compliance.
Pulse rate and Body Mass Index (BMI) were also taken into consideration as studies have shown that the physical attributes play a vital role in the properties of bio-signals.
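The time domain indices can be computed directly once the systolic and diastolic peaks of a pulse have been located. The sketch below assumes that SI is the subject's height divided by PTT, that RI is the ratio of diastolic to systolic peak amplitude, and that BMI is weight divided by height squared; the function names and the numerical values in the example are hypothetical and are not taken from the recorded data.

```python
def stiffness_index(height_m, t_systolic_s, t_diastolic_s):
    """SI in m/s: height divided by the systolic-to-diastolic peak interval."""
    ptt_s = t_diastolic_s - t_systolic_s
    if ptt_s <= 0:
        raise ValueError("diastolic peak must follow the systolic peak")
    return height_m / ptt_s


def reflection_index(systolic_amp, diastolic_amp):
    """RI: diastolic peak amplitude (b) over systolic peak amplitude (a)."""
    if systolic_amp <= 0:
        raise ValueError("systolic amplitude must be positive")
    return diastolic_amp / systolic_amp


def body_mass_index(weight_kg, height_m):
    """BMI in kg/m^2."""
    return weight_kg / height_m ** 2


if __name__ == "__main__":
    # Hypothetical pulse: peaks 0.25 s apart, diastolic peak at 60% of systolic.
    print(round(stiffness_index(1.74, 0.15, 0.40), 2))   # about 6.96 m/s
    print(round(reflection_index(1.0, 0.6), 2))
    print(round(body_mass_index(65.0, 1.74), 2))
```

In practice these values would be computed for every pulse in a 15-second segment and then averaged, although the averaging scheme is not specified here.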
2.3.3 Frequency domain: In addition to the time domain features, several frequency domain parameters were considered for the purpose of classification. The combination of time and frequency features helps in accurate classification. Power Ratio (PR) is a benchmark parameter for evaluating the power distribution in the acquired signal. The equation is given by:

$$ PR = \frac{P_{LF}}{P_{HF}} $$

where P_{LF} and P_{HF} are the signal powers in the low frequency (LF) and high frequency (HF) bands respectively.
Low frequency power is considered to be a quantitative marker for sympathetic modulations and sympathovagal activity. High frequency power is a measure of the cardiac parasympathetic vagal nervous activity. The different frequency bands have been illustrated in Fig. 2.3.
VLF: 0.003 to 0.04 Hz
LF: 0.04 to 0.15 Hz
HF: 0.15 to 0.40 Hz
Figure 2.3: Power Spectral Density (PSD) of PPG waveform with dominant frequency bands illustrated
Other features such as width in the lower and higher frequency bands have been used to measure the density of the different frequency elements of the signal. Peak frequency and amplitude in both frequency bands have been recorded for further analysis.
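The Power Ratio can be sketched as the ratio of the PSD integrated over the LF band (0.04 to 0.15 Hz) to the PSD integrated over the HF band (0.15 to 0.40 Hz). The Python example below assumes that a PSD estimate is already available as two lists, frequencies and power densities, and uses trapezoidal integration; the PSD estimator itself is not specified in this work, and the function names are hypothetical.

```python
def band_power(freqs_hz, psd, low_hz, high_hz):
    """Trapezoidal integral of the PSD over bins lying inside [low_hz, high_hz]."""
    power = 0.0
    for k in range(len(freqs_hz) - 1):
        f0, f1 = freqs_hz[k], freqs_hz[k + 1]
        if f0 >= low_hz and f1 <= high_hz:
            power += 0.5 * (psd[k] + psd[k + 1]) * (f1 - f0)
    return power


def power_ratio(freqs_hz, psd, lf=(0.04, 0.15), hf=(0.15, 0.40)):
    """PR = LF band power / HF band power."""
    lf_power = band_power(freqs_hz, psd, lf[0], lf[1])
    hf_power = band_power(freqs_hz, psd, hf[0], hf[1])
    if hf_power == 0:
        raise ValueError("HF band power is zero")
    return lf_power / hf_power


if __name__ == "__main__":
    # Hypothetical flat spectrum sampled every 0.01 Hz from 0 to 0.5 Hz.
    freqs = [k / 100 for k in range(51)]
    psd = [1.0] * len(freqs)
    # For a flat spectrum PR reduces to the ratio of band widths, 0.11 / 0.25.
    print(round(power_ratio(freqs, psd), 2))
```

The frequency resolution of the PSD must be fine enough to place several bins inside the LF band, which in turn requires a sufficiently long analysis window.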
Decision tree classifiers
A decision tree is a very popular data mining technique used for classification tasks. It is a classifier that can be stated as a recursive partition of the instance space and that adopts a top-down learning strategy [21-22]. It consists of a root node, zero or more internal nodes, and one or more terminal nodes. The root node has no incoming edges. A node with outgoing edges is referred to as an internal node, while all other nodes are called terminal nodes. Each internal node splits the instance space into several sub-spaces based on the attribute values. Each terminal node is assigned to one class representing the most appropriate target value. Instances are classified by navigating from the root node down to a terminal node, according to the outcome of the tests along the path. A classification rule in the decision tree represents the path from the root node to a specific terminal node. An important aspect of constructing an efficient decision tree is the selection of a good splitting criterion. The Gini diversity index is chosen as the splitting criterion. The Gini impurity measure d(t) at node t is given as follows:

$$ d(t) = 1 - S, \qquad S = \sum_{j=1}^{k} p^{2}(j \mid t) $$

where S is the impurity criterion, k denotes the number of classes existing in the node and p(j | t) is the relative frequency of class j in node t. The Gini diversity index of a node is largest when all the classes in the node occur with equal probability and is minimal when the node contains only one target class [23-24]. The extraction of classification rules from the decision tree is shown in Fig. 2.4.
Figure 2.4: Extraction of rules from a decision tree
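To make the splitting criterion concrete, the following sketch computes the Gini impurity of a node as one minus the sum of squared class frequencies, and searches a single feature for the threshold that minimizes the size-weighted impurity of the two child nodes. The threshold search is the usual CART-style procedure and is an assumption here, since the search strategy is not described in this work; the function names, feature values and class labels are hypothetical.

```python
from collections import Counter


def gini_impurity(labels):
    """d(t) = 1 - sum_j p(j|t)^2 for the class labels that reach a node."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_threshold(values, labels):
    """Return (threshold, weighted impurity) of the best split value <= threshold.

    Returns (None, parent impurity) when no split improves on the parent node.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, gini_impurity(labels))
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # identical feature values cannot be separated
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        score = (len(left) * gini_impurity(left)
                 + len(right) * gini_impurity(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best


if __name__ == "__main__":
    # Hypothetical Power Ratio values with hypothetical class labels.
    pr_values = [0.5, 0.6, 2.0, 2.2]
    classes = ["stress", "stress", "normal", "normal"]
    print(gini_impurity(classes))
    print(best_threshold(pr_values, classes))
```

A node with two equally frequent classes has impurity 0.5, the maximum for two classes, while a pure node has impurity 0, in line with the behaviour described above.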
RESULTS AND DISCUSSION
The volunteers participating in this study had a mean age of 20.43 years (range 19-22) and a mean Body Mass Index (BMI) of 21.1977 (range 19-25). The physical characteristics of the subjects from whom data were acquired are shown in Table 3.1.
Table 3.1: Information on subjects who participated in the study
                        Male             Female
Number of subjects      25               20
Age (years)             20.76 ± 0.44     20.09 ± 0.83
BMI                     21.57 ± 1.97     20.83 ± 1.28
Height (m)              1.74 ± 0.07      1.60 ± 0.06
The signals were pre-processed to remove artifacts. A peak detection algorithm was implemented to detect PPG complexes for the computation of time and frequency parameters. Visual inspection of the PPG recorded under different conditions revealed certain changes in morphology. The signals and their PSD are presented in Fig. 3.1.
Figure 3.1: PPG waveforms under different stress conditions
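The peak detection step is not described in detail, so the following is only a generic sketch of how systolic peaks might be located: a sample is accepted as a peak if it is a local maximum above a minimum height and lies at least a refractory interval after the previous peak. The function names, the 0.3 s refractory interval and the height threshold are all assumptions chosen for illustration.

```python
import math


def detect_peaks(samples, fs_hz=1000, min_height=0.0, min_interval_s=0.3):
    """Indices of local maxima >= min_height, at least min_interval_s apart."""
    min_gap = int(round(min_interval_s * fs_hz))
    peaks = []
    for i in range(1, len(samples) - 1):
        is_local_max = samples[i - 1] < samples[i] >= samples[i + 1]
        if not is_local_max or samples[i] < min_height:
            continue
        if peaks and i - peaks[-1] < min_gap:
            # Too close to the previous peak: keep whichever is taller.
            if samples[i] > samples[peaks[-1]]:
                peaks[-1] = i
            continue
        peaks.append(i)
    return peaks


def pulse_rate_bpm(peaks, fs_hz=1000):
    """Mean pulse rate from the average interval between detected peaks."""
    if len(peaks) < 2:
        raise ValueError("at least two peaks are required")
    mean_interval_s = (peaks[-1] - peaks[0]) / (len(peaks) - 1) / fs_hz
    return 60.0 / mean_interval_s


if __name__ == "__main__":
    # Hypothetical 10 s test signal: a 1.2 Hz sine sampled at 100 Hz.
    fs = 100
    signal = [math.sin(2 * math.pi * 1.2 * i / fs) for i in range(10 * fs)]
    found = detect_peaks(signal, fs_hz=fs, min_height=0.5)
    print(len(found), "peaks,", round(pulse_rate_bpm(found, fs_hz=fs)), "beats/min")
```

On real recordings the refractory interval prevents the diastolic peak from being counted as a separate pulse, provided that the signal has first been de-trended and filtered.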
Visual inspection shows a significant change in waveform duration and amplitude fluctuation between the normal and induced conditions. These morphological changes are directly reflected in the SI and RI time domain features. There is also a significant change in the energy distribution of the PPG signals under different conditions, as seen in the PSD plots. Tables 3.2 and 3.3 summarize the features that have been exploited in this study.
Table 3.2: Comparison of time domain features under different stress conditions
Stiffness Index (SI)      6.89 ± 0.88     6.37 ± 1.35     7.96 ± 0.86
Reflection Index (RI)     0.54 ± 0.12     0.69 ± 0.17     0.63 ± 0.17
Table 3.3: Comparison of frequency domain features under different stress conditions
Power Ratio (PR)      2.11 ± 0.32     1.01 ± 0.08      0.61 ± 0.07
Width in LF Band      0.07 ± 0.10     0.19 ± 0.11      0.18 ± 0.12
Width in HF Band      0.03 ± 0.04     0.076 ± 0.01     0.06 ± 0.03
PCR in LF Band        0.13 ± 0.12     0.02 ± 0.01      0.06 ± 0.15
PCR in HF Band        0.06 ± 0.05     0.02 ± 0.01      0.02 ± 0.01
The PR, SI and RI features were selected for classification using the decision tree classifier. The classification rules for induced cardiac stress and induced apnea are shown in Fig. 3.2 and 3.3 respectively.
Figure 3.2: Decision tree for classification of PPG recorded under induced cardiac condition
Figure 3.3: Decision tree for classification of PPG recorded under induced apnea condition
The number of nodes, the number of rules and the accuracy of the classifiers are given in the table below.
Table 3.4: Decision tree parameters
Induced cardiac stress condition
Induced apnea condition
Number of nodes
Number of rules
From the classification rules, PR was found to be an important predictor for the cardiac condition and RI for the respiratory condition. The derived rules cannot be fully explained on the basis of standard medical knowledge, since a decision tree can also discover unimportant rules.
This paper discusses a novel detection mechanism for cardio-respiratory disorders using a decision tree data mining approach. Time and frequency domain features were applied and an automated rule-based system using a decision tree was implemented. The study indicates that the data mining approach yields promising results compared with a neural network based classification approach. Classification accuracies of 94.44% and 97.19% were obtained for the cardiac and respiratory conditions respectively. In order to improve the classification efficacy, several cross-validation procedures with other data mining approaches are currently under investigation. Further, the algorithm needs to be validated under real-life conditions to establish the efficiency of the current classification algorithm.
The authors would like to acknowledge Dr. Mahesh V., Department of Biomedical Engineering, SSN College of Engineering, Chennai, India for the PPG acquisition procedure.