Malignant Thyroid Lesion Classification Biology Essay



Ultrasound has great potential to aid in the differential diagnosis of malignant and benign thyroid lesions, but interpretative pitfalls exist and the accuracy is still poor. To overcome these difficulties, we developed and analyzed a range of knowledge representation techniques for characterizing the intra-nodular vascularization of thyroid lesions. The analysis is based on data obtained from twenty nodules (ten benign and ten malignant) taken from 3D contrast-enhanced ultrasound images. Fine needle aspiration biopsy and histology confirmed malignancy. Discrete Wavelet Transform (DWT) and texture algorithms are used to extract relevant features from the thyroid images. The resulting feature vectors are fed to three different classifiers: K-Nearest Neighbor (K-NN), Probabilistic Neural Network (PNN), and Decision Tree (DeTr). The performance of these classifiers is compared using Receiver Operating Characteristic (ROC) curves. Our results show that the combination of DWT and texture features coupled with K-NN gave the best results, with an area under the ROC curve of 0.987, a classification accuracy of 98.9%, a sensitivity of 99.8%, and a specificity of 98.1%. Finally, we propose a novel integrated index called the Thyroid Malignancy Index (TMI), made up of DWT and texture features, to diagnose benign or malignant nodules using just one index. We hope that this TMI will help clinicians detect benign and malignant thyroid lesions more objectively.

Introduction


The National Cancer Institute estimated the number of new thyroid cancer cases to be 44,670 and the predicted number of deaths due to this cancer to be 1,690 in 2010 (1). Another database documents a recent rise in the number of cases of thyroid carcinoma, with an estimated increase of 3% per year in the incidence of thyroid cancer (2, 3). Although the incidence of thyroid cancer appears to be increasing, the number of patients evaluated for a thyroid nodule without carcinoma remains far greater. Thyroid nodules are very common and may occur in more than 50% of the adult population, with about 7% of thyroid nodules being diagnosed as malignant (4). In general, the incidence of thyroid nodules is on the rise due to the wider use of neck imaging (5).

Such statistics indicate that there is an urgent need for cost-effective thyroid diagnosis support systems. Cost efficiency is important because a large number of tests must be performed in order to detect a relatively small number of cancer cases. In terms of diagnosis technique, Fine Needle Aspiration Biopsy (FNAB) is considered to be the "gold standard" for the diagnosis of thyroid nodules (6). However, FNAB is too labor intensive to be used for large scale screening and has many pitfalls (7). A more effective detection strategy is to analyze medical images, because in this case the diagnosis process can be automated. Such systems are called Computer Aided Diagnosis (CAD) systems. Thyroid images can be used for diagnosis because the image texture reflects the histopathologic components of the thyroid nodules. Having established that thyroid images can be used for cancer detection, we need to address the issue of cost efficiency. A wide range of medical imaging modalities could be used for thyroid nodule diagnosis. Ultrasound imaging is, by a large margin, the most cost effective of these modalities. Bastin et al. (8) have shown that ultrasound characteristics of thyroid nodules (solid nodule, hypoechogenicity, microcalcification, macrocalcification, ill-defined margins, intranodular vascularity, and taller-than-wide shape) can predict the risk of malignancy. Unselected nodules without any suspicious ultrasound features showed a lower risk of malignancy (< 2%), whereas malignancy rates were much higher in nodules with at least two suspicious features. Recent guidelines endorsed this approach of using combinations of ultrasound features to guide nodule selection for fine needle aspiration. Chen et al. (9) have shown that, among the numerous textural features used for the differential diagnosis of thyroid malignancy, the sum average value reflected echogenicity and was able to differentiate between follicle-based and fibrosis-based thyroid nodules. Fibrosis showed the lowest echogenicity and the lowest difference sum average value. Enlarged follicles showed the highest echogenicity and the highest difference sum average value.

There is a wide range of ultrasound imaging methods. One of them is frequency encoded Doppler Ultrasound (DUS) imaging, which has been used for the identification of flow in thyroid tumors. However, the role of DUS in the evaluation of thyroid nodules for malignancy has not yet been accurately studied (10). Internal flow with minimal or no peripheral flow on DUS and a Resistance Index (RI) ≥ 0.70 were used to distinguish reliably between malignant and benign thyroid nodules. Nodules with prevailing peripheral vascularisation, minimal or no internal vascularisation, and an RI below 0.70 were found to be probably benign. Doppler studies need a quantitative evaluation of the internal nodule flow to avoid subjective interpretations and the partial views caused by the two-dimensional nature of traditional High Resolution Ultrasound (HRUS). The sonographic features of Hürthle Cell Neoplasms (HCNs) of the thyroid, such as the size and echogenicity of the tumors, the presence of cystic areas or calcifications, and detectable blood flow on color Doppler imaging, were studied in (11). The authors concluded that HCNs showed a spectrum of sonographic appearances, from predominantly hypoechoic to hyperechoic lesions and from peripheral blood flow with no internal flow to extensively vascularized lesions. They also indicated that differentiating benign from malignant HCNs was difficult using ultrasound and FNA techniques, and that complete removal of the lesion is therefore the only safe option. Contrast-Enhanced Ultrasound (CEUS) imaging was introduced to enhance the differential diagnosis of solitary thyroid nodules. The feasibility of CEUS imaging of the thyroid gland and the potential of this method for characterizing solitary thyroid nodules were studied in (12). The authors assessed the baseline echogenicity and the dynamic enhancement pattern of each nodule in comparison with the adjacent thyroid parenchyma. Their results show that CEUS of the thyroid gland was a feasible technique. However, overlapping findings seem to limit the potential of this technique in the characterization of thyroid nodules. Recently, enhancement patterns of thyroid nodules on gray-scale contrast-enhanced ultrasound were evaluated for the differential diagnosis (13). The results show that CEUS enhancement patterns were different in benign and malignant lesions. Ring enhancement was predictive of benign lesions, whereas heterogeneous enhancement was helpful for detecting malignant lesions.

The use of ultrasonographic contrast agents to improve the differential diagnosis of thyroid nodules was studied in (14). In the group of benign lesions, in the patients affected by nodular goiter, an intra-nodular perfusion was observed, as opposed to the healthy surrounding parenchyma. Even though the ultrasound contrast agent technique is somewhat invasive and is more expensive than FNA, the preliminary data of this pilot study suggested that this method may be useful to differentiate benign from malignant thyroid nodules.

In this study, we used Discrete Wavelet Transform (DWT) and texture based feature extraction methods for the differential diagnosis of malignant and benign thyroid lesions. We used these features from CEUS images because this imaging method is cost-effective and more efficient in differentiating benign from malignant lesions. The following section presents the materials and methods used for feature extraction, classification, and statistical analysis. In the subsequent results section, both classification results and associated statistical analysis are discussed. The literature presented in this introduction section forms the basis for a discussion. The discussion section projects the proposed thyroid diagnosis technique into a wider perspective by comparing the results of the proposed system with previously published classification results. In the conclusion section of this paper, we highlight both the cost effectiveness and the accuracy of the proposed method.

Materials and Methods

Figure 1 shows the block diagram of the proposed system. In general, computer aided diagnosis systems can be constructed with a feature extraction subsystem and a classification subsystem. In this work, we used both DWT and texture based features. The extracted feature vectors were fed to one of the three classifiers: K-Nearest Neighbor (K-NN), Probabilistic Neural Network (PNN), and Decision Tree (DeTr). The role of each individual component in the block diagram is described in this section.

Insert Figure 1 here


Twenty patients with a previously confirmed diagnosis of solitary thyroid nodule were enrolled in this study. Ten subjects were male (age: 53.5 ± 13.3 years; range: 22 - 71 years) and ten were female (age: 50.1 ± 10.8 years; range: 25 - 68 years). All the patients signed an informed consent form prior to participating in the experiment. The experimental protocol was approved by the ethical committee of the Endocrinology Section of the "Umberto I" Hospital of Torino (Italy).

All subjects underwent a clinical examination, a hormonal profile, and an ultrasound (B-Mode and Color Doppler) examination of the lesion. Then, 2.5 ml of ultrasound contrast agent (Sonovue, Bracco, Italy) was administered intravenously and a 3-D volume containing the lesion was acquired. Because of the bulk and weight of external mechanical scanning systems and the variability in the dimensions and position of the nodules, we preferred freehand scanning. A trained operator with more than 30 years of experience in neck ultrasonography (R.G.) performed all the scans. The high frame rate of the device compared to the slow movement of the probe ensured that there was no gap between adjacent frames. The average frame rate of the device during acquisition was 16 Hz.

Images were acquired with a MyLab70 ultrasound scanner (Biosound-Esaote, Genova, Italy) equipped with an LA-522 linear probe working in the range 4-10 MHz. All the images were acquired at 10 MHz. The volumes were transferred in DICOM format to an external workstation (Apple PowerPC, dual 2.5 GHz, 8 GB RAM) equipped with processing and reconstruction software.

All the subjects underwent ultrasound-guided FNAB of the thyroid lesion. Among the twenty nodules, ten were found to be malignant (six papillary, one follicular and one Hürthle cell carcinoma), and ten were benign (struma nodules). We acquired 40 data sets from each of the ten patients diagnosed with malignant nodules. Similarly, we acquired 40 data sets from each patient with benign thyroid nodules. Overall, our study contains 400 benign and 400 malignant data sets. The ten patients who were diagnosed with malignant nodules underwent thyroidectomy. The histopathological analysis confirmed the diagnosis of carcinoma for all ten patients. The results of the FNAB were used as the reference for the benign nodules: all were struma nodules. Figures 2(a) and 2(b) show typical benign and malignant thyroid images.

Insert Figure 2 here

Feature extraction

Feature extraction is one of the most important steps in automated CAD systems, because this step extracts relevant and representative features from measurement data such as images and signals. In this work, DWT and texture features were extracted from CEUS images.

DWT feature extraction: DWT is a useful and efficient tool for many image processing applications. DWT uses filter banks composed of finite impulse response filters (15). These filters decompose signals into low pass and high pass components. The low pass components contain information (in the form of coefficients) about slowly varying signal characteristics, and the high pass components contain information about sudden changes in the signal. When DWT is applied to images, there are four different filtering possibilities:

Low pass filtering is performed on both rows and columns. The resulting LL coefficients contain most of the image's total energy.

Low pass filtering is performed on the rows, and high pass filtering on the columns. The resulting HL coefficients contain the vertical details of the image.

High pass filtering is applied to the rows, and low pass filtering to the columns. The resulting LH coefficients contain the horizontal details of the image.

High pass filtering is performed on both rows and columns. The resulting HH coefficients contain the diagonal details of the image, and they are the finest-scale wavelet coefficients.

Decomposition is further performed on the LL sub-band to attain the next coarser scale of wavelet coefficients.
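The four filtering combinations listed above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: it uses the two-tap Haar filter pair instead of the Daubechies 8 wavelet used in this work, it performs a single decomposition level, and the function names (`haar_step`, `filter_rows`, `filter_columns`, `haar_dwt2`) are hypothetical. Sub-band names follow the convention of the text (row filter first, then column filter).

```python
import math

def haar_step(values):
    """Split an even-length sequence into low-pass and high-pass halves (Haar filters)."""
    scale = math.sqrt(2.0)
    low = [(values[k] + values[k + 1]) / scale for k in range(0, len(values), 2)]
    high = [(values[k] - values[k + 1]) / scale for k in range(0, len(values), 2)]
    return low, high

def transpose(matrix):
    return [list(column) for column in zip(*matrix)]

def filter_rows(matrix):
    """Apply one Haar step to every row; return (low-pass, high-pass) matrices."""
    pairs = [haar_step(row) for row in matrix]
    return [low for low, _ in pairs], [high for _, high in pairs]

def filter_columns(matrix):
    """Filter along columns by transposing, filtering rows, and transposing back."""
    low, high = filter_rows(transpose(matrix))
    return transpose(low), transpose(high)

def haar_dwt2(image):
    """One-level 2D decomposition of an image with even numbers of rows and columns."""
    rows_low, rows_high = filter_rows(image)
    ll, hl = filter_columns(rows_low)    # rows low-pass; columns low-pass / high-pass
    lh, hh = filter_columns(rows_high)   # rows high-pass; columns low-pass / high-pass
    return {"LL": ll, "HL": hl, "LH": lh, "HH": hh}

if __name__ == "__main__":
    toy_image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
    for name, band in haar_dwt2(toy_image).items():
        print(name, band)
```

Applying `haar_dwt2` again to the returned `LL` sub-band gives the next coarser level, which is the recursion described in the last item above.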

Insert Figure 3 here

In our work, the CEUS images were first converted to a grayscale representation and then DWT was applied. Figure 3 shows the complete passband structure for a 2D sub-band transform with three levels. In this work, we have used Daubechies (Db) 8 as the mother wavelet.

The individual sub-bands are represented as matrices, and each matrix is reduced to a single scalar feature. The method for combining the matrix elements is the same for all sub-band features. All the elements within each row of the matrix are added, and the elements of the resulting vector are squared and then summed to form a scalar. Finally, this scalar is normalized by dividing it by the number of rows and columns of the original matrix. The resulting features are named A2, H2, H1, V2, V1, D2, and D1 after the corresponding sub-bands shown in Figure 3.
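The reduction of a sub-band matrix to one scalar, as described above, might look like the following sketch. The function name `subband_feature` is hypothetical, and the normalization is an assumption: "the number of rows and columns" is read here as the product rows × columns.

```python
def subband_feature(matrix):
    """Reduce a sub-band coefficient matrix (list of rows) to one scalar feature."""
    n_rows = len(matrix)
    n_cols = len(matrix[0])
    # Step 1: add all elements within each row.
    row_sums = [sum(row) for row in matrix]
    # Step 2: square the elements of the resulting vector and add them up.
    scalar = sum(value * value for value in row_sums)
    # Step 3: normalize (assumption: divide by rows * columns).
    return scalar / (n_rows * n_cols)

if __name__ == "__main__":
    print(subband_feature([[1, 2], [3, 4]]))
```

For the 2 × 2 toy matrix, the row sums are 3 and 7, the sum of squares is 58, and the normalized feature is 58 / 4 = 14.5.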

Texture feature extraction: Texture features measure the smoothness, coarseness, and regularity of the pixels that form an image. These measures describe a mutual relationship among intensity values of neighboring pixels repeated over an area larger than the size of the relationship (16). There are two common approaches to texture analysis: statistical analysis and structural analysis. In the statistical approach, scalar measurements of the textures are obtained. This approach characterizes textures as, for example, smooth, coarse, or grainy. These methods are based on both the distributions of and the relationships between the intensity values of pixels. Measures include entropy, contrast, and correlation based on the gray level co-occurrence matrix. Structural texture analysis is more complex than the statistical approach (17). It presents detailed symbolic descriptions of the image. Parameters that are extracted using the statistical approach are more suitable for image analysis than those obtained using the structural method (18). In this section, the statistical parameters extracted from the CEUS images are briefly described. The Gray Level Co-occurrence Matrix (GLCM) of an M × N image I is defined (15) by


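$$C_d(i, j) = \left| \left\{ \big( (p, q), (p + \Delta x, q + \Delta y) \big) : I(p, q) = i,\ I(p + \Delta x, q + \Delta y) = j \right\} \right| \quad [1]$$

where $(p, q), (p + \Delta x, q + \Delta y) \in M \times N$, $d = (\Delta x, \Delta y)$, and $|\cdot|$ denotes the cardinality of a set. The probability of a pixel with a gray level value $i$ having a pixel with a gray level value $j$ at a $(\Delta x, \Delta y)$ distance away in an image is

$$P_d(i, j) = \frac{C_d(i, j)}{\sum_i \sum_j C_d(i, j)} \quad [2]$$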


Based on the above mentioned, we obtain the following features:

, [3]

, and [4]

. [5]

The homogeneity feature measures the similarity between two pixels that are a fixed displacement apart. The denseness and the degree of disorder in an image are measured by the energy and entropy features, respectively. In general, the entropy feature has its maximum value when all elements of the co-occurrence matrix are the same. The symmetry projections indicate prominent directions within the texture of CEUS images, and therefore symmetry is an important discriminative feature of these images.
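As a rough illustration of the co-occurrence features discussed above, the sketch below counts gray-level pairs for one displacement, normalizes the counts to probabilities, and evaluates homogeneity, energy, and entropy. The formulas are the common textbook forms (homogeneity as the inverse difference moment, energy as the sum of squared probabilities, entropy with the natural logarithm); they are assumptions and may differ from the exact expressions used in this work. All function names are hypothetical, and the symmetry feature is not covered.

```python
import math
from collections import Counter

def glcm_probabilities(image, d_row=0, d_col=1):
    """Co-occurrence probabilities P[(i, j)] for pixel pairs separated by (d_row, d_col)."""
    n_rows, n_cols = len(image), len(image[0])
    counts = Counter()
    for p in range(n_rows):
        for q in range(n_cols):
            p2, q2 = p + d_row, q + d_col
            if 0 <= p2 < n_rows and 0 <= q2 < n_cols:
                counts[(image[p][q], image[p2][q2])] += 1
    total = sum(counts.values())
    return {pair: count / total for pair, count in counts.items()}

def homogeneity(prob):
    # Large when co-occurring gray levels are similar (i close to j).
    return sum(p / (1 + (i - j) ** 2) for (i, j), p in prob.items())

def energy(prob):
    # Large when a few gray-level pairs dominate the matrix.
    return sum(p * p for p in prob.values())

def entropy(prob):
    # Largest when all non-zero probabilities are equal.
    return -sum(p * math.log(p) for p in prob.values() if p > 0)

if __name__ == "__main__":
    toy_image = [[0, 0, 1, 1], [0, 0, 1, 1], [2, 2, 3, 3], [2, 2, 3, 3]]
    prob = glcm_probabilities(toy_image)
    print(homogeneity(prob), energy(prob), entropy(prob))
```

The horizontal displacement `(d_row=0, d_col=1)` is only a default for the example; other directions and distances are obtained by changing the two parameters.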

Classifiers used

There are three classifiers used in this work, namely K-Nearest Neighbor (K-NN), Probabilistic Neural Network (PNN), and Decision Tree (DeTr). They are briefly described in this section.

K-Nearest Neighbor (K-NN): K-NN is based on the minimum distance from a query instance to the training samples. The K training samples closest to the query are determined, and the class held by the majority of these K nearest neighbors is taken as the prediction (19).
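A minimal sketch of the K-NN rule described above is given below. The Euclidean distance, the default `k=3`, and the name `knn_predict` are assumptions made for illustration; the distance measure and the value of K used in this work are not stated here.

```python
import math
from collections import Counter

def knn_predict(train_vectors, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest training samples."""
    # Pair every training sample's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(vector, query), label)
        for vector, label in zip(train_vectors, train_labels)
    )
    # Count the labels of the k closest samples and return the most frequent one.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

if __name__ == "__main__":
    vectors = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0]]
    labels = ["benign", "benign", "malignant", "malignant"]
    print(knn_predict(vectors, labels, [0.2, 0.1]))
```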

Probabilistic Neural Network (PNN): PNN is a specific type of two layer radial basis network which is often used for classification. The first layer of neurons in a PNN has radial basis activation functions. This layer computes the distances between the input vector and the training vectors and produces a vector whose elements indicate how close the input is to each training vector. The second layer (competitive layer) sums these contributions for each class of inputs and produces a vector of probabilities as its output. The so-called compete transfer function, at the output of the second layer, selects the maximum of these probabilities and assigns a 1 to the selected class and a 0 to all other classes (20).
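The two layers described above can be sketched as follows. This is an illustrative simplification: the Gaussian kernel, the spread parameter `sigma`, the per-class averaging, and the name `pnn_predict` are assumptions and are not taken from this work.

```python
import math

def pnn_predict(train_vectors, train_labels, query, sigma=1.0):
    """Return (winning class, class probabilities) for `query`."""
    class_sums, class_counts = {}, {}
    for vector, label in zip(train_vectors, train_labels):
        # First layer: radial basis activation of the distance to each training vector.
        squared_distance = sum((a - b) ** 2 for a, b in zip(vector, query))
        activation = math.exp(-squared_distance / (2.0 * sigma ** 2))
        # Second layer: accumulate the contributions class by class.
        class_sums[label] = class_sums.get(label, 0.0) + activation
        class_counts[label] = class_counts.get(label, 0) + 1
    scores = {label: class_sums[label] / class_counts[label] for label in class_sums}
    total = sum(scores.values())
    if total == 0.0:
        # All activations underflowed; fall back to equal probabilities.
        probabilities = {label: 1.0 / len(scores) for label in scores}
    else:
        probabilities = {label: score / total for label, score in scores.items()}
    # Compete step: the class with the highest probability wins.
    winner = max(probabilities, key=probabilities.get)
    return winner, probabilities

if __name__ == "__main__":
    vectors = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0]]
    labels = ["benign", "benign", "malignant", "malignant"]
    print(pnn_predict(vectors, labels, [0.2, 0.1]))
```

The spread `sigma` controls how quickly the influence of a training vector decays with distance, so in practice it would need to be tuned to the scale of the feature vectors.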

Decision Tree (DeTr): The DeTr classifier generates a tree and a set of rules that represent the model and identify the different classes in the given data. The rules can then be used to classify unknown data (21).

Statistical analysis

Student's t-test is a statistical hypothesis test used to assess whether two groups have different means on some measure. If there is less than a 5% chance of obtaining the observed difference by chance (p < 0.05), a statistically significant difference between the two groups is reported. Lower p-values indicate stronger evidence that the two groups differ.
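The test statistic behind such a comparison can be sketched with the standard library alone. The example below computes the pooled-variance (equal-variance) two-sample t statistic and its degrees of freedom; whether this variant or another was used in this work is an assumption, the name `student_t` is hypothetical, and the sample values are made up. Converting the statistic to a p-value requires the t distribution, which is not part of the standard library and is left out.

```python
import math
from statistics import mean, variance

def student_t(group_a, group_b):
    """Return (t statistic, degrees of freedom) for two independent samples."""
    n_a, n_b = len(group_a), len(group_b)
    dof = n_a + n_b - 2
    # Pooled estimate of the common variance of both groups.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / dof
    # Difference of the means, scaled by its standard error.
    t = (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    return t, dof

if __name__ == "__main__":
    # Hypothetical feature values for two groups, for illustration only.
    print(student_t([1.0, 2.0, 3.0], [2.0, 3.0, 4.0]))
```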

The Receiver Operating Characteristic (ROC) curve is a plot in a two dimensional space. The x-axis is '1 - specificity' and the y-axis is 'sensitivity'. Sensitivity, also known as the true positive fraction, is the probability that a test result is positive when the disease is present; specificity is the probability that a test result is negative when the disease is absent. The Area Under the ROC Curve (AUC) summarizes the classifier performance across the entire range of cut-off points. Conventionally, the AUC falls in the range between 0.5 and 1 (22), and an area closer to one indicates that the classifier has a better accuracy. The AUC is therefore a good indicator of classifier performance (23).
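One simple way to obtain the AUC without drawing the curve is to use its rank interpretation: the area under the empirical ROC curve equals the fraction of (positive, negative) pairs in which the positive case receives the higher classifier score, with ties counted as one half. The sketch below follows that idea; the function name and the scores are hypothetical.

```python
def auc_from_scores(positive_scores, negative_scores):
    """Area under the empirical ROC curve from classifier scores of the two classes."""
    wins = 0.0
    for pos in positive_scores:
        for neg in negative_scores:
            if pos > neg:
                wins += 1.0      # positive case ranked above negative case
            elif pos == neg:
                wins += 0.5      # tie counts as half
    return wins / (len(positive_scores) * len(negative_scores))

if __name__ == "__main__":
    # Hypothetical scores: higher means "more likely malignant".
    print(auc_from_scores([0.9, 0.4], [0.5, 0.1]))
```

In the example, three of the four pairs are ordered correctly, so the AUC is 0.75; perfectly separated scores give 1.0 and random scores give about 0.5.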

Thyroid Malignancy Index (TMI)

In this work, we have used Entropy, Homogeneity and Symmetry texture features to develop an integrated index TMI. It is difficult to track how these three texture features vary in a patient for making an appropriate diagnosis. Hence, we have formulated an integrated index by combining these features in such a way that the index is distinct for benign and malignant nodules. The TMI is defined as follows.


Such an integrated index would help in a faster and more objective detection of benign and malignant thyroid nodules.

Results


Ten-fold stratified cross validation was used to test the classifiers. Using this technique, the whole dataset was split into ten roughly equal parts. Nine parts of the data (training set) were used for classifier development and the built classifier was evaluated using the remaining part (test set); with 800 data sets, this corresponds to 720 images for training and 80 images for testing in each fold. This procedure was repeated ten times using a different part as the test set in each case. The accuracy, sensitivity, specificity, positive predictive value, and AUC were averaged over all ten folds to obtain the overall performance measures. The range of features, classification results, and the range of TMI are given in the following sections.
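The stratified splitting procedure can be sketched as follows. This is an illustrative implementation, not the one used in this work: the function names, the random seed, and the round-robin assignment of shuffled samples to folds are assumptions. Stratification here means that each fold receives the same share of benign and malignant samples.

```python
import random

def stratified_folds(labels, n_folds=10, seed=0):
    """Assign sample indices to folds so that every class is spread evenly."""
    rng = random.Random(seed)
    folds = [[] for _ in range(n_folds)]
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of this class to the folds in turn.
        for position, index in enumerate(indices):
            folds[position % n_folds].append(index)
    return folds

def train_test_splits(labels, n_folds=10, seed=0):
    """Yield (train indices, test indices), holding out one fold at a time."""
    folds = stratified_folds(labels, n_folds, seed)
    for held_out in range(n_folds):
        test = folds[held_out]
        train = [i for k, fold in enumerate(folds) if k != held_out for i in fold]
        yield train, test

if __name__ == "__main__":
    labels = ["benign"] * 400 + ["malignant"] * 400
    for train, test in train_test_splits(labels):
        print(len(train), len(test))
```

With 400 benign and 400 malignant data sets and ten folds, each held-out fold contains 40 samples of each class.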

Table I documents the results of the statistical analysis of the DWT and texture features. The last column of this table shows the p-value of each feature. All p-values are below 0.0001, which indicates that the differences between the benign and malignant groups are statistically significant for all features. The homogeneity, symmetry, and all the DWT features are higher for malignant nodules than for benign nodules because benign images (Figure 2(a)) have more structure than malignant thyroid images (Figure 2(b)). Images with more structure, such as the benign thyroid images, have more variation in their grayscale values than the malignant thyroid images, and therefore have higher entropy values.

Insert Table I here

Table II presents the classification results obtained by using the extracted DWT and texture features in the three classifiers. The first column indicates the classifier used. The next four columns present the average number of True Negatives (TN), False Negatives (FN), True Positives (TP), and False Positives (FP) obtained over the ten folds. The average classification accuracy is shown in column 6. Columns 7, 8, and 9 show the average values of the sensitivity, specificity, and AUC, respectively. The classification accuracy of all three tested classifiers is above 96%. The results also show that the K-NN classifier performs better than DeTr and PNN, with an accuracy of 98.9%. Figure 4 shows the ROC curves of the three classifiers; here too, K-NN performs better than the other two classifiers.
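The measures reported in Table II follow directly from the four confusion counts. The sketch below shows the usual definitions; the function name is hypothetical and the counts in the example are made up for illustration, not taken from Table II.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity, and specificity from confusion counts."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),   # share of all cases classified correctly
        "sensitivity": tp / (tp + fn),                 # share of malignant cases detected
        "specificity": tn / (tn + fp),                 # share of benign cases correctly cleared
    }

if __name__ == "__main__":
    # Hypothetical counts for one test fold of 80 images.
    print(classification_metrics(tp=40, tn=39, fp=1, fn=0))
```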

Insert Table II here

Insert Figure 4 here

Thyroid malignancy index results

Table III shows the TMI values (mean ± standard deviation) for the two classes. It can be seen from the table that they are distinctly different from each other, without any overlap. Figure 5 shows the box plot of the TMI values, which clearly highlights the separation between the two classes.

Insert Table III here

Insert Figure 5 here

Discussion


Recent advances in ultrasound techniques have paved the way for many investigators to propose new imaging algorithms for diagnosing malignancy in thyroid nodules. Ultrasound methods for thyroid cancer diagnosis are cost-effective, and they perform as well as other thyroid cancer diagnosis methods. In this section, we compare the results obtained using our technique with those of other techniques in the literature which also aim to diagnose malignant thyroid nodules.

Finley et al. (24) classified benign and malignant thyroid nodules using molecular profiling. In their study, they carried out cluster analysis using 62 samples from two classes (benign and malignant). The results of their study show a sensitivity and specificity of 91.7% and 96.2%, respectively. Cerutti et al. (25) proposed a pre-operative diagnostic method to distinguish benign from malignant thyroid lesions based on gene expression. A total thyroidectomy was the treatment of choice, and a negative result was confirmed on permanent pathology in 20 cases. The immunohistochemistry correctly classified 29 of 32 fine-needle aspirations (90.6%) and 23 of 27 follicular thyroid adenomas (85.2%). The authors of this study were not satisfied with either the sensitivity or the specificity, and therefore proposed further work to increase both measures.

Patton et al. (26) differentiated between malignant and benign solitary thyroid nodules by fluorescent scanning. They demonstrated a sensitivity of 93.8% in the identification of cancer. However, the accuracy in the distinction between benign and malignant tissues was only 77.0%. Regardless of the cost, this classification accuracy is low by the standards of state-of-the-art CAD systems. B-mode sonographic images of inflamed and healthy tissues were differentiated automatically using texture features (27). A classification success rate of 100% was achieved with as few as one optimal feature among the 129 texture characteristics tested. The stability of the results with respect to sonograph setting, thyroid gland segmentation, and scanning direction was tested. That study, however, considered only normal and inflamed thyroid images.

Thus, based on the above facts, we felt the necessity for a better technique that can improve the classification efficiency and that is also more economical. In our study, we proposed a CAD system for the detection of benign and malignant thyroid lesions from ultrasound images. Our proposed method is simple and does not involve intensive computation. We have proposed a novel integrated index called Thyroid Malignancy Index (TMI) that can be used to identify benign and malignant conditions with high accuracy. Moreover, we used the extracted features in classifiers and concluded that a combination of DWT and texture parameters coupled with a simple K-NN classifier can be used for automated classification. Our proposed system is able to identify the unknown class with an accuracy, sensitivity and specificity of more than 96%. In addition to this, the proposed TMI is distinct for each of the two classes, and therefore, can help in faster, easier, more cost-effective, and more objective detection of benign and malignant lesions.

Conclusion


There is a need for cost-efficient biomedical diagnostic support systems. In this work, we have investigated the performance of the proposed CEUS based thyroid cancer CAD system using texture and DWT parameters. The extracted DWT and texture features were fed as input to three different classifiers to compare their performances. Our results show that the combination of DWT and texture features coupled with the K-NN classifier gave a classification accuracy of 98.9%, a sensitivity of 99.8%, and a specificity of 98.1%.

In order to make the differentiation faster and more objective, we have gone one step further and formulated a non-dimensional integrated index (given by Eqn. 6) that is composed of texture features. Based on the information presented in Table III and Figure 5, it is evident that the TMI can be employed effectively for the diagnosis of benign and malignant nodules. The advantage of this integrated index is that, in order to make a diagnosis, the physician needs to look at the value of just one index instead of checking the range of each individual feature. Hence, the TMI can be used as an adjunct tool for clinicians to cross-check their diagnosis.