Multiresolution based feature extraction method for breast cancer diagnosis in digital mammogram

1. Introduction

Cancer is a leading cause of death worldwide; it accounted for 7.4 million deaths (around 13% of all deaths) in 2004. More than 70% of all cancer deaths occurred in low and middle income countries. Deaths from cancer worldwide are projected to continue rising, with an estimated 12 million deaths in 2030 (WHO, 2009). Breast cancer is one of the leading causes of cancer death among women. According to statistics published by the World Health Organization (WHO), there were 519,000 deaths from this disease in 2004 (WHO, 2009). Early detection and treatment are considered the most promising approach to reducing breast cancer mortality (Cheng et al., 2003).

Digital mammography is one of the most suitable methods for early detection of breast cancer. However, the visual clues are subtle and vary in appearance, making diagnosis difficult and challenging even for specialists (Verma and Zhang, 2007). A false positive detection causes an unnecessary biopsy. It has been estimated that only 20-30% of breast biopsy cases prove to be cancerous (Mousa et al., 2005; Soltanian-Zadeh et al., 2004). On the other hand, in a false negative detection, an actual tumor remains undetected. Studies have shown that 10-30% of visible cancers go undetected (Christoyianni et al., 2002). Thus, there is a significant need for methods that automatically classify suspicious areas in mammograms, to help radiologists improve the efficacy of screening programs and avoid unnecessary biopsies.

Computer aided detection (CAD) systems, which use computer technologies to detect abnormalities such as microcalcification, mass, architectural distortion and asymmetry, can play a key role in early detection of breast cancer and help to reduce the mortality rate among women with breast cancer (Tang et al., 2009). Computer aided methods in the field of digital mammography are divided into two main categories: computer aided detection methods, which pinpoint suspicious regions in mammograms for further analysis by an expert radiologist, and computer aided diagnosis methods, which decide whether the examined suspicious regions consist of abnormal or healthy tissue and distinguish between benign and malignant cases (Christoyianni et al., 2002).

CAD systems for detecting masses or microcalcifications in mammograms have already been used and have proven to be a potentially powerful tool (Ranjayyan et al., 2007). Radiologists are attracted by the effectiveness of clinical applications of CAD systems. However, CAD systems still need to be improved to meet the requirements of clinics and screening centers (Tang et al., 2009).

One of the main points to consider when implementing a robust classifier for recognizing breast tissue is the selection of appropriate features that describe and highlight the differences between abnormal and normal tissues sufficiently well (Christoyianni et al., 2002). Feature extraction is an important factor that directly affects the classification result. Features are extracted for each suspected area and may represent textures, statistical properties, the spatial domain, the fractal domain or wavelet bases (Fu et al., 2005). Most systems extract features to detect abnormalities and classify them as benign or malignant. The classification of malignant and benign cases is still a difficult and challenging problem for researchers (Ferreira and Borges, 2003).

There are various feature extraction methods that serve to condense input data and to reduce redundancies by highlighting important characteristics of the image. The features of digital images can be extracted directly from the spatial data or from a different space after using a transform such as Fourier transform, wavelet transform or curvelet transform.

Multiresolution analysis provides a very sparse and efficient representation for images. In recent years, several schemes for mammogram analysis using wavelets have been introduced. Liu et al. (2001) showed that multiresolution analysis of mammograms improves the effectiveness of diagnosis systems based on wavelet coefficients. In their mammogram analysis study, they used a set of statistical features with a binary tree classifier to detect spiculated masses. The achieved success rate was 84.2%. Mousa et al. (2005) proposed a system based on wavelet analysis. They used an adaptive neuro-fuzzy inference system (ANFIS) to build a classifier that distinguishes normal from abnormal tissue and determines whether the abnormality is a mass or a microcalcification. The maximum classification rate obtained was 85.4%. Rashed et al. (2007) studied the multiresolution analysis of digital mammograms using the wavelet transform to extract a fraction of the biggest coefficients. They used Daubechies-4, -8 and -16 wavelet functions with four levels of decomposition. Euclidean distance was used to classify between microcalcification clusters, spiculated mass, circumscribed mass, ill-defined mass and normal mammograms. The maximum classification rate achieved was 87.06%. Ferreira and Borges (2003) proposed a system to classify mammogram images by transforming the images into wavelet bases and then using a set of the biggest coefficients from the first level of decomposition as the feature vector for separating microcalcification clusters, spiculated mass, circumscribed mass and normal classes of images. The maximum classification rate achieved was 94.85%.

Moayedi et al. (2007) presented a study of contourlet based mammography mass classification using a support vector machine (SVM). In their study, a set of statistical properties of contourlet coefficients from 4 decomposition levels, co-occurrence matrix features and geometrical features are used as the feature vector of the region of interest (ROI). A genetic algorithm was used for feature selection based on neural network pattern classification. They concluded that the contourlet features improve the classification process. Eltoukhy et al. (2009) presented a study of mammogram classification based on the curvelet transform. A fraction of the biggest coefficients from each decomposition level is used as the feature vector. They reported that the multiresolution based approach achieved promising results.

In this paper a multiresolution representation based feature extraction method is proposed and tested using a set of images provided by the Mammographic Image Analysis Society (MIAS) (Suckling et al., 1994). The objective is to find the features that best represent mammograms. Firstly, two multiresolution analysis methods, wavelet and curvelet, are used to decompose each of the mammogram images separately. Then, from the transform of each mammogram, a set of coefficients is extracted. Finally, a nearest neighbor classifier based on Euclidean distance is constructed. Three problems are considered. The first problem is to differentiate between normal and abnormal mammograms. The second problem is to classify the type of abnormality into microcalcification clusters, spiculated mass, circumscribed mass, ill-defined mass, architectural distortion and asymmetry. The third problem is to distinguish between benign and malignant tumors.

The remainder of this paper is organized as follows. Section 2 gives a brief introduction to wavelet and curvelet representations. Section 3 discusses the proposed method, followed by the implementation in section 4. Results and discussion are presented in section 5, while section 6 contains the conclusion.

2. Preliminaries

2.1. Wavelet Transform

2.1.1 Multi-resolution and one dimensional wavelet representation:

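Let $\mathbb{Z}$ and $\mathbb{R}$ denote the sets of integers and real numbers respectively. The original signal $f(x)$ is measurable and has finite energy: $f(x) \in L^2(\mathbb{R})$. The multiresolution approximation of $f(x)$ at a resolution $2^j$ is defined as the orthogonal projection of the signal on a vector space $V_{2^j}$ of $L^2(\mathbb{R})$. The space $V_{2^j}$ can be interpreted as the set of all possible approximations at the resolution $2^j$ of functions in $L^2(\mathbb{R})$, as detailed in (Mallat, 1989). The approximation $A_{2^{j+1}} f$ at resolution $2^{j+1}$ contains more information than the approximation $A_{2^j} f$ at resolution $2^j$. The detail signal of $f(x)$ at resolution $2^j$ is denoted by $D_{2^j} f$. The details can be defined as the difference between $A_{2^{j+1}} f$ and $A_{2^j} f$. $D_{2^j} f$ is equivalent to the orthogonal projection of $f(x)$ on the orthogonal complement $O_{2^j}$ of the vector space $V_{2^j}$ in $V_{2^{j+1}}$. According to the theory of multiresolution signal decomposition (Mallat, 1989), there exists a unique scaling function $\phi(x)$ and a unique corresponding wavelet function $\psi(x)$, where $\phi_{2^j}(x) = 2^j \phi(2^j x)$ and $\psi_{2^j}(x) = 2^j \psi(2^j x)$, such that $\{\sqrt{2^{-j}}\, \phi_{2^j}(x - 2^{-j} n)\}_{n \in \mathbb{Z}}$ and $\{\sqrt{2^{-j}}\, \psi_{2^j}(x - 2^{-j} n)\}_{n \in \mathbb{Z}}$ are orthonormal bases of $V_{2^j}$ and $O_{2^j}$ respectively. The approximation and detail signals of the original signal $f(x)$ at resolution $2^j$ are completely characterized by the sequences of inner products of $f(x)$ with $\phi_{2^j}$ and $\psi_{2^j}$ as follows:

$$A_{2^j} f = \left( \langle f(u), \phi_{2^j}(u - 2^{-j} n) \rangle \right)_{n \in \mathbb{Z}}$$

$$D_{2^j} f = \left( \langle f(u), \psi_{2^j}(u - 2^{-j} n) \rangle \right)_{n \in \mathbb{Z}}$$

Let H be a low-pass filter and G be a high-pass filter, where the impulse response of the filter H is $h(n) = \langle \phi_{2^{-1}}(u), \phi(u - n) \rangle$, and the impulse response of the filter G is $g(n) = \langle \psi_{2^{-1}}(u), \phi(u - n) \rangle$. Define $\tilde{H}$ with impulse response $\tilde{h}(n) = h(-n)$ to be the mirror filter of H, and $\tilde{G}$ with impulse response $\tilde{g}(n) = g(-n)$ to be the mirror filter of G. The multiresolution representation of $f(x)$ at any resolution $2^j$ can be implemented by a pyramidal algorithm: $A_{2^{j+1}} f$ is filtered with $\tilde{H}$ and $\tilde{G}$, and every other sample of each output is kept to obtain $A_{2^j} f$ and $D_{2^j} f$ respectively, as illustrated in the corresponding figure.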
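The pyramidal algorithm can be sketched in a few lines. The example below is only an illustration under a simplifying assumption: it uses the two-tap Haar filters (low-pass $[1/\sqrt{2}, 1/\sqrt{2}]$, high-pass $[1/\sqrt{2}, -1/\sqrt{2}]$) rather than the Daubechies filters used in this study, and the function names `haar_analysis_step` and `haar_pyramid` are hypothetical.

```python
import math


def haar_analysis_step(signal):
    """One pyramid level: filter with the Haar low/high-pass pair, keep every other sample.

    Returns (approximation, detail); each is half as long as the input.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = 1.0 / math.sqrt(2.0)
    approx = [s * (signal[i] + signal[i + 1]) for i in range(0, len(signal), 2)]
    detail = [s * (signal[i] - signal[i + 1]) for i in range(0, len(signal), 2)]
    return approx, detail


def haar_pyramid(signal, levels):
    """Apply the analysis step repeatedly to the approximation signal.

    Returns the coarsest approximation and the detail signals, finest level first.
    """
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, detail = haar_analysis_step(approx)
        details.append(detail)
    return approx, details


if __name__ == "__main__":
    coarse, details = haar_pyramid([4, 6, 10, 12, 8, 6, 5, 5], levels=2)
    print(coarse, details)
```

Because the Haar pair is orthonormal, the sum of squared coefficients over the coarse approximation and all detail signals equals the energy of the input signal.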

2.1.2 Two-dimensional wavelet representation

The wavelet model can be extended to two-dimensional signals by a separable multiresolution approximation of $L^2(\mathbb{R}^2)$ with scaling function $\Phi(x, y) = \phi(x)\phi(y)$. There are three associated wavelet functions, $\Psi^1(x, y) = \phi(x)\psi(y)$, $\Psi^2(x, y) = \psi(x)\phi(y)$ and $\Psi^3(x, y) = \psi(x)\psi(y)$, where $\psi(x)$ is the one-dimensional wavelet function associated to $\phi(x)$. With this formulation, the wavelet decomposition of a two-dimensional signal can be computed with a separable extension of the one-dimensional decomposition algorithm, as illustrated in the corresponding figure.

The accompanying figure illustrates the multilevel decomposition of the image $A_{2^{j+1}} f$ into $A_{2^j} f$, $D^1_{2^j} f$, $D^2_{2^j} f$ and $D^3_{2^j} f$ in the frequency domain. The images $A_{2^j} f$, $D^1_{2^j} f$, $D^2_{2^j} f$ and $D^3_{2^j} f$ correspond respectively to the lowest frequencies, the vertical high frequencies (horizontal edges), the horizontal high frequencies (vertical edges) and the high frequencies in both directions (diagonal), i.e. $A_{2^{j+1}} f = A_{2^j} f + D^1_{2^j} f + D^2_{2^j} f + D^3_{2^j} f$. This set of images is called an orthogonal wavelet representation in two dimensions (Mallat, 1989). After $J$ decomposition levels, the image $A_{2^{-J}} f$ is the coarse approximation at the resolution $2^{-J}$, and the images $D^1_{2^{-j}} f$, $D^2_{2^{-j}} f$ and $D^3_{2^{-j}} f$ ($1 \le j \le J$) give the detail signals for different orientations and resolutions. If the original image has $N$ pixels, then each of the images $A_{2^{-j}} f$, $D^1_{2^{-j}} f$, $D^2_{2^{-j}} f$ and $D^3_{2^{-j}} f$ will have $2^{-2j} N$ pixels ($j > 0$), so that the total number of pixels in this new representation is equal to $N$. This process can be summarized as follows. An image is decomposed into orthogonal sub-bands with low-low (LL), low-high (LH), high-low (HL) and high-high (HH) components, which correspond respectively to the approximation, horizontal, vertical and diagonal details. The LL sub-band is further decomposed into another four sub-bands. The low-low-low-low (LLLL) component represents the image approximation at this level, and the process can be continued (Al-Qdaha et al., 2005).
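The separable decomposition and the pixel-count argument can be checked with a small sketch. It assumes an unnormalized Haar pair (pairwise average and pairwise half-difference) instead of the Daubechies-8 filters used in this study, and an image whose sides are divisible by $2^J$; the helper names and the sub-band naming (first letter for the filter along rows, second for the filter along columns) are choices made for this illustration only.

```python
def haar_rows(matrix):
    """Split every row into a low-pass half (pairwise mean) and a high-pass half (pairwise half-difference)."""
    low = [[(r[i] + r[i + 1]) / 2.0 for i in range(0, len(r), 2)] for r in matrix]
    high = [[(r[i] - r[i + 1]) / 2.0 for i in range(0, len(r), 2)] for r in matrix]
    return low, high


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def haar_dwt2(image):
    """One separable level: filter along rows, then along columns. Returns (LL, LH, HL, HH)."""
    low, high = haar_rows(image)
    ll_t, lh_t = haar_rows(transpose(low))    # low along x, then low/high along y
    hl_t, hh_t = haar_rows(transpose(high))   # high along x, then low/high along y
    return transpose(ll_t), transpose(lh_t), transpose(hl_t), transpose(hh_t)


def decompose(image, levels):
    """Repeatedly decompose the LL sub-band; returns the final LL and the detail triples per level."""
    approx = image
    details = []
    for _ in range(levels):
        approx, lh, hl, hh = haar_dwt2(approx)
        details.append((lh, hl, hh))
    return approx, details


def total_size(approx, details):
    """Number of coefficients in the whole representation."""
    size = lambda m: len(m) * len(m[0])
    return size(approx) + sum(size(band) for triple in details for band in triple)


if __name__ == "__main__":
    image = [[1, 1, 1, 1], [5, 5, 5, 5], [1, 1, 1, 1], [5, 5, 5, 5]]
    approx, details = decompose(image, levels=2)
    print(approx, total_size(approx, details))
```

For the striped test image, only the sub-band that is high-pass along the columns is non-zero at the first level, which matches the statement that vertical high frequencies correspond to horizontal edges.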

2.2. Curvelet Transform

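The discrete curvelet transform was proposed by Candes and Donoho (2000), from the idea of representing a curve as a superposition of functions of various lengths and widths obeying the curvelet scaling law $\text{width} \approx \text{length}^2$.

The second generation of the curvelet transform is presented in (Candes et al., 2006). The work is done throughout in two dimensions, i.e. $\mathbb{R}^2$, with $x$ as spatial variable, $\omega$ as frequency domain variable, and $r$ and $\theta$ as polar coordinates in the frequency domain. A pair of windows $W(r)$ and $V(t)$ are defined as the radial window and angular window respectively. These are smooth, nonnegative and real-valued, with $W$ taking positive real arguments and supported on $r \in (1/2, 2)$, and $V$ taking real arguments and supported on $t \in [-1, 1]$. These windows will always obey the admissibility conditions:

$$\sum_{j=-\infty}^{\infty} W^2(2^j r) = 1, \quad r \in (3/4, 3/2)$$

$$\sum_{\ell=-\infty}^{\infty} V^2(t - \ell) = 1, \quad t \in (-1/2, 1/2)$$

For each $j \ge j_0$, a frequency window $U_j$ is defined in the Fourier domain by

$$U_j(r, \theta) = 2^{-3j/4}\, W(2^{-j} r)\, V\!\left( \frac{2^{\lfloor j/2 \rfloor} \theta}{2\pi} \right) \tag{2.7}$$

where $\lfloor j/2 \rfloor$ is the integer part of $j/2$. Thus the support of $U_j$ is a polar wedge defined by the support of $W$ and $V$, applied with scale dependent window widths in the radial and angular directions. The symmetrized version of (2.7), namely $U_j(r, \theta) + U_j(r, \theta + \pi)$, is used to obtain real valued curvelets.

The waveform $\varphi_j(x)$ is defined by means of its Fourier transform $\hat{\varphi}_j(\omega) = U_j(\omega)$. Let $U_j(\omega_1, \omega_2)$ be the window defined in the polar coordinate system by (2.7). $\varphi_j$ is the mother curvelet in the sense that all curvelets at scale $2^{-j}$ are obtained by rotations and translations of $\varphi_j$. A sequence of translation parameters $k = (k_1, k_2) \in \mathbb{Z}^2$ and rotation angles $\theta_\ell = 2\pi \cdot 2^{-\lfloor j/2 \rfloor} \cdot \ell$ are introduced, with $\ell = 0, 1, \ldots$ such that $0 \le \theta_\ell < 2\pi$ (the spacing between consecutive angles is scale-dependent). The curvelet functions are functions of $x = (x_1, x_2)$ defined at scale $2^{-j}$, orientation angle $\theta_\ell$ and position $x_k^{(j,\ell)} = R_{\theta_\ell}^{-1}(k_1 \cdot 2^{-j}, k_2 \cdot 2^{-j/2})$ by

$$\varphi_{j,\ell,k}(x) = \varphi_j\!\left( R_{\theta_\ell}\left( x - x_k^{(j,\ell)} \right) \right)$$

where $R_\theta$ is the rotation by $\theta$ radians and $R_\theta^{-1}$ is its inverse,

$$R_\theta = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}, \qquad R_\theta^{-1} = R_\theta^{T} = R_{-\theta}$$

A curvelet coefficient is the inner product of an element $f \in L^2(\mathbb{R}^2)$ and a curvelet $\varphi_{j,\ell,k}$,

$$c(j, \ell, k) = \langle f, \varphi_{j,\ell,k} \rangle = \int_{\mathbb{R}^2} f(x)\, \overline{\varphi_{j,\ell,k}(x)}\, dx$$

The curvelet transform obeys an anisotropic scaling relation, $\text{length} \approx 2^{-j/2}$, $\text{width} \approx 2^{-j}$, such that $\text{width} \approx \text{length}^2$. The fast digital curvelet transform can be implemented via two methods, using unequispaced FFTs or using wrapping (Candes et al., 2006). In this study, the method of unequispaced FFTs is used.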
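To make the scale dependence concrete, the sketch below enumerates the rotation angles and the approximate envelope size at a given scale. It follows the continuous-domain construction of Candes et al. (2006), in which the angles at scale $2^{-j}$ are spaced by $2\pi \cdot 2^{-\lfloor j/2 \rfloor}$ and the envelope has length about $2^{-j/2}$ and width about $2^{-j}$; digital implementations may use a different number of angles per scale. The function names are hypothetical.

```python
import math


def rotation_angles(j):
    """Equispaced angles theta_l = 2*pi * 2**(-floor(j/2)) * l with 0 <= theta_l < 2*pi (j >= 0)."""
    count = 2 ** (j // 2)
    return [2.0 * math.pi * l / count for l in range(count)]


def envelope(j):
    """Approximate (length, width) of a curvelet at scale 2**(-j) under parabolic scaling."""
    length = 2.0 ** (-j / 2.0)
    width = 2.0 ** (-j)
    return length, width


if __name__ == "__main__":
    for j in range(2, 7):
        length, width = envelope(j)
        print(j, len(rotation_angles(j)), length, width)
```

The number of orientations doubles at every second scale, while the width always stays close to the square of the length.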

3. Proposed Method

The method for extracting features has two phases, so the dataset was divided into two sets, $S_1$ and $S_2$. The set $S_1$ contains images from all classes. The first phase consists of building the basis of the classifier from the set $S_1$, which contains labeled images. The images are decomposed using the wavelet or curvelet transform. The obtained coefficients are used to construct a matrix $A$ with one row per image and $n$ columns, where $n$ is the number of coefficients for each image (typically a large number). A mean feature vector of each class is calculated as the mean of its images. A second matrix $B$ of size $c \times n$ is constructed, where $c$ is the number of classes, i.e. each row represents the mean feature vector of a class. The standard deviation of each column of $B$ is calculated. Each entry of the resulting standard deviation vector indicates how well the corresponding column separates the classes. A column is kept if its standard deviation is bigger than a fixed threshold value; otherwise it is suppressed. The remaining columns are used as the feature vector for each image.

The second phase is the testing phase; it uses the second set of images, which are to be classified. The feature vector of an image is calculated by decomposing the image using the wavelet or curvelet transform as discussed previously. Then, all columns are suppressed except the columns retained in the building phase. The image is then classified according to the smallest Euclidean distance between its vector and the class core vectors. The proposed feature extraction method is summarized in Fig. 5.
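The building phase described above can be sketched as follows. This is a minimal illustration, not the original implementation: the function names are hypothetical, feature vectors are plain Python lists, and the sample standard deviation (division by the number of classes minus one) is an assumption, since the text does not say which estimator was used.

```python
import statistics


def class_means(features, labels):
    """Mean feature vector of each class, as {label: [column means]}."""
    by_class = {}
    for row, label in zip(features, labels):
        by_class.setdefault(label, []).append(row)
    return {label: [statistics.fmean(col) for col in zip(*rows)]
            for label, rows in by_class.items()}


def select_columns(means, threshold):
    """Indices of the columns whose standard deviation across the class means exceeds the threshold."""
    rows = list(means.values())
    spread = [statistics.stdev(col) for col in zip(*rows)]  # assumption: sample standard deviation
    return [i for i, s in enumerate(spread) if s > threshold]


def project(vector, keep):
    """Reduce a coefficient vector to the retained columns."""
    return [vector[i] for i in keep]


if __name__ == "__main__":
    features = [[1.0, 5.0, 0.0], [3.0, 5.0, 0.2], [9.0, 5.1, 0.1], [11.0, 5.1, 0.1]]
    labels = ["normal", "normal", "abnormal", "abnormal"]
    keep = select_columns(class_means(features, labels), threshold=1.0)
    print(keep, project([7.0, 5.0, 0.3], keep))
```

In this toy example only the first column differs strongly between the two class means, so it is the only one retained; a test image is then reduced to the same column before classification.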

4. Implementation

In the present study, a set of images provided by the MIAS is used to test the proposed technique. These images were previously investigated and labeled by an expert radiologist based on technical experience and biopsy. This dataset was selected because of the variety of cases it includes. It is also widely used in similar research work (e.g. Mousa et al., 2005; Ferreira and Borges, 2003; Rashed et al., 2007). The dataset is composed of 322 mammograms of right and left breasts, from 161 patients, of which 51 were diagnosed as malignant, 64 as benign and 207 as normal. Table 1 presents the dataset distribution between the different classes, benign and malignant.

Table 1 here

The original mammograms are 1024 x 1024 pixels, and almost 50% of the whole image consists of background with a lot of noise. A cropping operation is therefore applied to the images to cut off the unwanted portions. ROIs of 128 x 128 pixels are cropped. The cropping process was performed manually. The centers of the abnormal areas (given by experts) are selected to be the centers of the ROIs. In that way, no abnormality is suppressed with the background.

The data set was divided into two sets. The first set is used to construct the feature vectors, while the second set is used to test the proposed method. Wavelet and curvelet transforms are used to represent mammogram images as a pre-process for mammogram classification. In this work we use the curvelet transform and the Daubechies-8 wavelet function (db8), with four decomposition levels, based on previous works (Mousa et al., 2005; Rashed et al., 2007; Candes, 2004).
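Although the cropping in this study was done manually, the operation itself is simple to express in code. The sketch below extracts a square ROI centered on a given abnormality center; the function name is hypothetical, the image is a list of pixel rows indexed from the top-left corner, and shifting the window when the center lies too close to the border is an assumption of this illustration.

```python
def crop_roi(image, center_row, center_col, size=128):
    """Return a size x size window centered on (center_row, center_col).

    The window is shifted, not shrunk, when the center is closer than size/2 to a border.
    """
    height, width = len(image), len(image[0])
    if size > height or size > width:
        raise ValueError("ROI is larger than the image")
    half = size // 2
    top = min(max(center_row - half, 0), height - size)
    left = min(max(center_col - half, 0), width - size)
    return [row[left:left + size] for row in image[top:top + size]]


if __name__ == "__main__":
    # Small synthetic image whose pixel value encodes its position.
    image = [[r * 8 + c for c in range(8)] for r in range(8)]
    roi = crop_roi(image, center_row=4, center_col=4, size=4)
    print(roi)
```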

After obtaining the coefficient vector of each image, the mean of each class of images is calculated to produce a matrix in which each row is the mean vector of a class. The standard deviation of this matrix of class means is calculated column by column to produce a vector of standard deviations. A threshold value is calculated using the formula presented by Donoho (1995), where $n$ is the length of the coefficient vector. A hard threshold is applied to the standard deviation vector. Table 2 presents the number of coefficients before and after applying the threshold.
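The exact threshold expression is not reproduced in the text. A common form associated with Donoho's work is the universal threshold $\sigma \sqrt{2 \ln n}$; the sketch below assumes that form, with $\sigma = 1$ by default, purely for illustration, and it may differ from the formula actually used in this study. The function names are hypothetical.

```python
import math


def universal_threshold(n, sigma=1.0):
    """Universal threshold sigma * sqrt(2 * ln(n)); assumed form, see the note above."""
    return sigma * math.sqrt(2.0 * math.log(n))


def hard_threshold_columns(std_vector, threshold):
    """Indices of the entries of the standard deviation vector that survive a hard threshold."""
    return [i for i, s in enumerate(std_vector) if s > threshold]


if __name__ == "__main__":
    n = 128 * 128  # number of pixels in one ROI, used here as an example vector length
    t = universal_threshold(n)
    print(t, hard_threshold_columns([0.5, 5.0, 4.0, 9.1], t))
```

With this assumed formula the threshold grows only slowly with the vector length, so the number of retained columns is driven mainly by how strongly the class means differ.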

Table 2 here

In this study three problems are addressed:

  1. The classification of normal versus abnormal class.
  2. The classification of abnormalities (i.e. microcalcification clusters, spiculated masses, circumscribed masses, ill-defined masses, architectural distortion and asymmetry).
  3. The classification of the risk level of cancerous cells based on tumor nature (i.e. benign versus malignant).

The classification step is performed using a nearest neighbor classifier. The classifier uses the Euclidean distance as a metric between the corresponding coefficients. For each class, a set of images is used to build the class core vector; each entry of the class core vector is calculated using equation 4.1:

$$v_c(i) = \frac{1}{N_c} \sum_{k=1}^{N_c} x_k^{(c)}(i) \tag{4.1}$$

where $x_k^{(c)}$ is the feature vector of the $k$-th building image of class $c$ and $N_c$ is the number of building images in that class.

Then the remaining images are classified by calculating the distance between the tested image and the class core vectors, as in equation 4.2:

$$d(x, v_c) = \sqrt{\sum_{i} \left( x(i) - v_c(i) \right)^2} \tag{4.2}$$

The system automatically assigns the tested image to the class for which the obtained distance is the smallest.
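The classification rule can be sketched directly from this description: each class core vector is the entry-wise mean of the feature vectors of its building images, and a test image goes to the class whose core vector is nearest in Euclidean distance. The function names and the toy data are hypothetical.

```python
import math


def class_core(vectors):
    """Entry-wise mean of the feature vectors of one class."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify(vector, cores):
    """Label of the class core vector closest to the given feature vector."""
    return min(cores, key=lambda label: euclidean(vector, cores[label]))


if __name__ == "__main__":
    cores = {
        "benign": class_core([[0.0, 0.0], [2.0, 0.0]]),
        "malignant": class_core([[10.0, 10.0], [12.0, 10.0]]),
    }
    print(classify([2.0, 1.0], cores), classify([9.0, 8.0], cores))
```

Because only one core vector is stored per class, classification of a test image costs one distance computation per class, regardless of how many images were used in the building phase.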

For comparison purposes, the method of biggest coefficients proposed in (Ferreira and Borges, 2003; Rashed et al., 2007) has been implemented as well. The method consists of extracting the biggest 100 coefficients from each decomposition level.
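The baseline can be sketched as a top-k selection per decomposition level. The sketch assumes that "biggest" means largest in absolute value, which the text does not state explicitly, and it represents each level as a flat list of coefficients; the function names are hypothetical.

```python
import heapq


def biggest_coefficients(coefficients, k=100):
    """The k coefficients of largest magnitude, in decreasing order of magnitude."""
    return heapq.nlargest(k, coefficients, key=abs)


def biggest_per_level(levels, k=100):
    """Concatenate the k biggest coefficients of every decomposition level into one feature vector."""
    features = []
    for coefficients in levels:
        features.extend(biggest_coefficients(coefficients, k))
    return features


if __name__ == "__main__":
    levels = [[0.1, -5.0, 3.0, -0.2, 4.0], [0.5, -0.1, 0.7]]
    print(biggest_per_level(levels, k=2))
```

Unlike the proposed method, this selection is made independently for each image, so the retained coefficients do not necessarily come from the same positions in different images.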

5. Results and Discussion

Table 3 presents the classification accuracy rates achieved in differentiating normal and abnormal classes. It shows that 100% of abnormal tissues have been detected, i.e. the true positive rate reached 100%. The detection of normal class reached 99.04%. Table 3 illustrates that wavelet outperforms curvelet in classification between normal and abnormal mammograms. The highest average classification accuracy rate for both classes is 99.52%, while it is 91.75% when using the biggest coefficients method.
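As a quick arithmetic check, the reported 99.52% is consistent with the unweighted mean of the two per-class rates quoted above (assuming that this is how the average was formed):

```latex
\[
\frac{100\% + 99.04\%}{2} = \frac{199.04\%}{2} = 99.52\%
\]
```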

Table 4 shows that all classes of abnormalities are correctly classified by using curvelet coefficients except the microcalcification class. The highest average classification accuracy rate for all classes is 98.72% achieved by using the curvelet coefficients. In this case the curvelet transform outperforms the wavelet transform. The performance of the proposed method is significantly higher than the biggest coefficients method.

Table 5 shows that all images of the malignant class have been classified correctly. The highest average accuracy rate is 96.88%, achieved by using either the wavelet or the curvelet coefficients, which is still higher than the highest average obtained by using the biggest coefficients method. Table 5 illustrates that wavelet and curvelet give the same accuracy in differentiating between benign and malignant.

Table 3 here

Table 4 here

Table 5 here

The obtained results indicate that the proposed method is a good technique for feature extraction. The comparison study shows that the proposed method achieves higher accuracy than the method of the 100 biggest coefficients which was introduced earlier (Ferreira and Borges, 2003; Rashed et al., 2007).

The comparison between wavelet and curvelet shows that the wavelet outperforms the curvelet in distinguishing between abnormal and normal tissue, while the curvelet is more effective in discriminating between the abnormality classes. This supports the claim that the curvelet transform provides a stable, efficient and near-optimal representation of otherwise smooth objects having discontinuities along smooth curves (Soman, 2006).

6. Conclusion

A feature extraction method for finding the most significant coefficients was proposed and implemented to detect and classify breast cancer classes in mammogram images. This work focuses on exploiting the advantages of multiresolution representations. The method is based on maximizing the distances between the different classes, after which a nearest neighbor classifier based on Euclidean distance is constructed. The classification accuracy rates achieved by the proposed method were 99.52% for normal versus abnormal, 98.72% for the classification of the abnormality type, and 96.88% for determining whether the tumor is benign or malignant. The results of the proposed method were compared to the previous method developed in (Ferreira and Borges, 2003; Rashed et al., 2007). The comparison study indicated that the proposed method gives higher accuracy rates. The comparison between curvelet and wavelet showed that wavelet outperforms curvelet in the classification of normal versus abnormal. The curvelet proved more effective than the wavelet in the classification of the abnormality type. Both wavelet and curvelet produced the same results in the classification between benign and malignant.


References

Al-Qdaha, M., Ramlib, A., Mahmud, R., 2005. A system of microcalcifications detection and evaluation of the radiologist: comparative study of the three main races in Malaysia. Computers in Biology and Medicine 35, 905-914.

Candes, E., 2004. CurveLab-2.1.2 Toolbox.

Candes, E., Demanet, L., Donoho, D., Ying, L., 2006. Fast discrete curvelet transforms. Multiscale Model. Simul. 5 (3), 861-899.

Candes, E., Donoho, D., 2000. Curvelets, multi-resolution representation, and scaling laws. Wavelet Applications in Signal and Image Processing VIII (4119-1), SPIE.

Cheng, H.D., Cai, X., Chen, X., Hu, L., Lou, X., 2003. Computer aided detection and classification of microcalcifications in mammograms: a survey. Pattern Recognition 36, 2967-2991.

Christoyianni, I., Koutras, A., Dermatas, E., Kokkinakis, G., 2002. Computer aided diagnosis of breast cancer in digitized mammograms. Computerized Medical Imaging and Graphics 26, 309-319.

Donoho, D., 1995. De-noising by soft-thresholding. IEEE Transactions on Information Theory, 41(3), 613-627.

Eltoukhy, M.M., Faye, I., Belhaouari, S.B., 2009. Using curvelet transform to detect breast cancer in digital mammogram. Proc. of the 5th International Colloquium on Signal Processing and its Application (CSPA), Kuala Lumpur.

Ferreira, C.B.R., Borges, D.L., 2003. Analyses of mammogram classification using a wavelet transform decomposition. Pattern Recognition Letters 24, 973-982.

Fu, J.C., Lee, S.K., Wong, S.T.C., Yeh, J.Y., Wang, A.H., Wu, H.K. 2005. Image segmentation, feature selection and pattern classification for mammographic microcalcifications. Computerized Medical Imaging and Graphics 29, 419-429.

Liu, S., Babbs, C.F., Delp, E., 2001. Multiresolution detection of spiculated lesions in digital mammograms. IEEE Transactions on Image Processing 10 (6), 874-884.

Mallat, S.G., 1989. A theory for multiresolution signal decomposition: the wavelet representation. IEEE Transactions on Pattern Analysis and Machine Intelligence 11 (7), 674-693.

Moayedi, F., Azimifar, Z., Boostani, R., Katebi, S., 2007. Contourlet based mammography mass classification. Lecture Notes in Computer Science 4633, 923-934.

Mousa, R., Munib, Q., Moussa, A., 2005. Breast cancer diagnosis system based on wavelet analysis and fuzzy-neural. Expert Systems with Applications 28, 713-723.

Ranjayyan, R., Ayers, F., Desautles, J., 2007. A review of computer aided diagnosis of breast cancer: Toward the detection of subtle signs. Journal of the Franklin Institute 344, 312-348.

Rashed, E.A., Ismail, I.A., Zaki, S.I., 2007. Multiresolution mammogram analysis in multilevel decomposition. Pattern Recognition Letters 28, 286-292.

Soltanian-Zadeh, H., Rafiee-Rad, F., Pourabdollah-Nejad, S., 2004. Comparison of multiwavelet, wavelet, Haralick, and shape features for microcalcification classification in mammograms. Pattern Recognition 37, 1973-1986.

Soman, K.P., Ramachandran, K.I., 2006. Insight into wavelets: from theory to practice. Second Edition, Prentice-Hall.

Suckling, J., Parker, J., Dance, D., Astley, S., Hutt, I., Boggis, C., et al., 1994. The mammographic image analysis society digital mammogram database. Excerpta Medica. International Congress Series 1069, 375-378.

Tang, J., Ranjayyan, R., El Naqa, I., Yang, Y., 2009. Computer aided detection and diagnosis of breast cancer with mammogram: recent advances. IEEE Transactions on Information Technology in Biomedicine 13 (2), 236-251.

Verma, B., Zhang, P., 2007. A novel neural-genetic algorithm to find the most significant combination of features in digital mammograms. Applied Soft Computing 7, 612-625.

WHO Cancer Facts Sheet, 2009. index.html